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CHAPTER  1 


Introduction 

Microwave  receivers  play  a  vital  role  in  Electronic  Warfare  (EW)  environments  for  passive  identification 
and  localization  of  unknown  targets  emitting  high-frequency  electro-magnetic  signals.  All  presently  available 
microwave  receivers  utilize  analog  signal  processing  tools  and  techniques  [1-2,  3].  Microwave  signals  have  very 
high  frequency  content  and  have  wide  bandwidths.  As  of  now,  there  are  no  EW  receivers  that  process  microwave 
radar  signals  entirely  in  the  digital  domain.  It  is  expected  that,  with  the  emergence  of  increasingly  faster  and 
inexpensive  digital  computers  and  high-speed  A/D  converters,  digital  processing  of  microwave  signals  would  most 
certainly  be  the  way  of  the  future  [4,  5,  6].  One  of  the  main  purpose  of  this  project  is  to  complement  the  research 
on  Digital  Microwave  Receiver  Design  under  progress  at  the  EW  Laboratory  at  WPAFB,  Dayton,  Ohio. 

In  addition  to  the  digital  receiver  design  problem,  some  fundamental  theoretical  aspects  of  several  classical 
System  Identification  problems  are  also  being  studied  as  part  of  this  project.  In  particular,  a  unified  framework 
is  proposed  for  optimal  estimation  of  rational  transfer  function  parameters  from  prescribed  Time-Domain  or 
Frequency-Domain  specifications.  This  powerful  unifying  theoretical  framework  for  System  Identification  appear 
to  have  remained  mostly  unrecognized  and  unutilized.  Apart  from  the  digital  EW  receiver  design  problem,  the 
proposed  theoretical  foundation  is  expected  to  have  a  broad  range  of  applications  in  rational  modeling. 

Significant  progress  has  already  been  made  during  the  first  year  of  research  on  this  Project.  Several  research 
problems  of  current  interest  have  been  addressed  and  solved  satisfactorily.  Most  of  the  new  results  were  proposed 
in  the  original  proposal,  but  some  results  are  being  developed  as  the  needs  arise  at  the  Wright  Labs.  This  annual 
report  contains  the  details  of  all  the  results  of  the  research  that  have  been  accomplished  so  far.  The  importance 
of  any  research  may  perhaps  be  best  judged  by  the  quality  of  publications  it  produces.  A  significant  amount 
of  time  was  also  spent  on  preparing  Journal  and  Conference  articles  based  on  the  research  results.  Most  of 
the  results  contained  in  this  report  have  been  published/accepted/presented  in  internationally  recognized,  top 
quality  Signal  Processing  Journals/Conferences  and  few  are  under  review/preparation  for  future  publication.  The 
papers/publications  ensuing  from  this  research  are  listed  at  the  end  of  this  introductory  Chapter. 

The  research  conducted  under  this  project  can  be  categorized  primarily  into  two  broad  themes,  viz., 

(i)  Digital  EW  Receiver  Design  Problems  :  The  problems  addressed  are  as  follows  : 

(a)  A  high-resolution  method  for  AOA  estimation  using  Minimum-Norm  Method  that  does  not  rely  on  any 
Eigendecomposition 

(b)  A  high-resolution  Maximum-Likelihood  method  for  frequency  estimation  that  guarantees  unit-circle 
roots 

(c)  Two  methods  for  superior  estimation  of  AR  and  ARMA  parameters  when  the  observation  data  is  noisy 

(d)  Time-Domain  algorithms  for  detection  of  Electronic  Warfare  Signals  in  the  presence  of  Noise 

(ii)  Optimal  System  Identification  Problems  : 

(a)  Optimal  identification  of  1-D  Rational  Systems  from  Input-Output  Data 

(b)  Optimal  identification  of  1-D  Rational  Systems  in  the  Frequency  Domain 

(c)  Optimal  Identification  of  All-Pole  Rational  Systems  in  Time-Domain 

(d)  Design  of  Denominator  Separable  2-D  IIR  Filters 
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The  report  is  organized  as  follows:  In  Chapter  2,  the  research  results  on  Digital  EW  receiver  design  related 
problems  are  reported  whereas  in  Chapter  3,  the  System  Identification  areas  are  covered  in  complete  details. 
Individual  Chapters  are  divided  into  several  Sections  by  topics.  In  the  following  paragraphs  the  main  results 
obtained  in  these  Sections  are  outlined  very  briefly. 

CHAPTER  2.  THE  Digital  MICROWAVE  RECEIVER  DESIGN  PROBLEM 

Section  -  2.1  :  Minimum-Norm  Method  Without  Eigendecomposition  :  Many  existing  high-resolution 
methods,  such  as  MUSIC  or  Minimum-Norm  Method,  must  rely  on  special-purpose  hardware  or  software  for 
obtaining  the  signal  and  noise  subspace  eigenvectors  of  Autocorrelation  (AC)  matrices.  We  have  proposed  a  new 
DFT-based  high-resolution  frequency  estimation  algorithm  which  does  not  require  any  eigendecomposition  and 
hence,  it  is  much  less  computation  intensive.  It  is  demonstrated  that  the  DFT  of  the  AC  matrix  (DFT-of-AC) 
essentially  performs  an  equivalent  task  of  separating  the  signal  and  noise  subspaces.  Furthermore,  when  the 
signal-subspace  part  of  the  DFT-of-AC  vectors  are  used  in  MNM,  almost  identical  high-resolution  AOA  estimates 
are  produced.  The  results  will  be  extended  to  2-D  case.  The  preliminary  results  have  been  accepted  as  a  Journal 
paper  and  an  ICASSP  paper.  It  may  be  noted  here  that  according  to  one  of  the  anonymous  reviewers  of  the 
journal  paper,  this  work  is  a  “significant  breakthrough  in  source  localization” . 

Section  -  2.2  :  Maximum-Likelihood  Method  with  Exact  Constraints :  A  recently  proposed  approximate 
Maximum-Likelihood  Estimator  (MLE)  of  multiple  exponentials,  converts  the  frequency  estimation  problem  into 
a  problem  of  estimating  the  coefficients  of  a  z-polynomial  with  roots  at  the  desired  frequencies.  Theoretically, 
the  roots  of  the  estimated  polynomial  should  fall  on  the  unit  circle.  But  MLE,  as  originally  proposed,  does  not 
guarantee  unit  circle  roots.  This  drawback  sometimes  causes  merged  frequency  estimates,  especially  at  low  SNR. 
If  all  the  sufficient  conditions  for  the  z-polynomial  to  have  unit  circle  roots  are  incorporated,  the  optimization 
problem  becomes  too  nonlinear  and  it  loses  the  desirable  weighted-quadratic  structure  of  MLE.  In  this  work, 
the  exact  constraints  are  imposed  on  each  of  the  lst-order  factors  corresponding  to  individual  frequencies  for 
ensuring  unit  circle  roots.  The  constraints  are  applied  during  optimization  alternately  for  each  frequency.  In  the 
absence  of  any  merged  frequency  estimates,  the  RMS  values  more  closely  approach  the  theoretical  Cramer-Rao 
(CR)  bound  at  low  SNR  levels.  The  work  has  been  accepted  for  publication  as  Journal  paper  [13]  and  it  is  also 
under  preparation  for  consideration  as  Conference  paper  [14].  The  results  will  be  extended  to  2-D  case. 

Section  -  2.3  :  Improved  A R- Parameter  Estimation  From  Noisy  Observation  Data  :  Auto-Regressive 
(AR)  modeling  is  the  most  widely  used  approach  for  model-based  spectrum  estimation.  But  almost  all  the 
existing  methods  for  AR-parameter  estimation  show  severe  degradation  if  the  observed  signal  is  corrupted  with 
noise.  In  fact,  all  the  commonly  used  techniques,  such  as,  Autocorrelation  Method  (AM),  Covariance  Method 
(CM),  Modified  Covariance  Method  and  their  variations,  give  poor  Power  Spectral  Density  (PSD)  estimates 
when  the  observations  are  noisy.  In  this  part  of  the  project,  a  data-adaptive  pre-filtering  approach  is  presented 
to  address  this  problem.  Preliminary  results  indicate  that  when  only  noisy  data  is  available  for  modeling,  the 
proposed  technique  gives  more  accurate  PSD  estimates  than  the  commonly  used  methods.  A  conference  paper 
on  this  work  have  been  accepted  [15]  and  a  more  comprehensive  version  is  under  consideration  for  publication  as 
a  Journal  paper  [16]. 

Section  -  2.4  :  Improved  ARMA-Parameter  Estimation  From  Noisy  Observation  Data  :  Existing 
methods  for  ARM  A  modeling  assume  that  the  available  process  is  produced  by  an  ARM  A  system  driven  by  a 
white  input  process,  t.e.,  the  observed  process  is  considered  to  be  pure  ARMA.  In  practice,  the  available  data 
usually  have  observation  noise  added  to  it  but  the  ARMA  methods  do  rot  address  this  problem.  Simulations 
show  that  performance  of  the  existing  ARMA  methods  deteriorate  when  the  observation  process  is  noisy.  In  this 
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part  of  the  project  a  new  ARMA  algorithm  is  given  which  utilizes  a  recently  developed  deterministic  rational 
system  identification  method  (OM-IO)  [8]  that  minimizes  the  modeling  or  output  error  norm.  The  algorithm  first 
estimates  the  input  process  and  then  invokes  OM-IO  using  the  input-output  data.  Simulations  indicate  that  the 
proposed  method  is  quite  effective  even  at  low  SNR  observation  data.  A  conference  paper  on  this  work  have  been 
accepted  [17]  and  a  more  comprehensive  version  is  under  preparation  for  publication  as  a  Journal  paper. 

Section  -  2.5  :  Time-Domain  Detection  of  Electronic  Warfare  Signals  in  Noise  :  Almost  all  existing 
AOA/RF  estimation  algorithms  assume  that  the  signal  is  already  present  in  the  observed  data.  But  in  the  passive 
mode  of  operations  of  EW  applications,  source  signals  may  not  be  present  at  all  within  the  observation  window, 
or  the  signals  may  fill  only  a  part  of  the  estimation  window.  In  either  case,  any  frequency  estimation  algorithm 
would  essentially  produce  erroneous  or  noise  frequencies  because  the  observed  signal  would  not  satisfy  the  model 
assumed  by  the  estimation  algorithm.  Considering  the  relatively  high  computational  burden,  any  estimation 
method  should  be  invoked  only  when  a  detection  scheme  indicates  high  probability  of  presence  of  threat.  In 
this  part  of  the  project,  the  theory  of  detecting  sinusoids  from  Quantized  and  Noisy  time-domain  observation 
samples  have  been  developed.  The  theoretical  work  on  single/multiple  samples  is  mostly  complete.  Studies  with 
Quantized  data  have  also  been  performed  and  the  results  appear  reasonably  good.  Lab  tests  for  the  Envelope 
Detection  and  Square-Law  cases  have  been  conducted  at  Wright  Labs  with  satisfactory  results. 

CHAPTER  3.  OPTIMAL  SYSTEM  IDENTIFICATION  PROBLEMS 

Fundamental  contributions  have  been  made  in  1-D  Rational  System  Identification  theory.  Three  key  journal 
papers  have  been  published/accepted  and  a  number  of  conference  publications  have  been  generated.  The  proposed 
comprehensive  framework  encompasses  a  large  class  of  Identification  problems  from,  (a)  Input-Output  data  [18, 
19],  (b)  Impulse  Response  Data  :  AR  case  [110,  Ill],  ARMA  case  [112]  and  (c)  Frequency  Response  Data  [113, 
114]. 

Section  -  3.1  :  Identification  of  1-D  Rational  Systems  from  Input-Output  Data  :  A  theoretical  and 
algorithmic  framework  is  proposed  for  optimal  identification  of  rational  transfer  function  parameters  of  discrete- 
time  linear  systems  from  Input-Output  (IO)  data.  The  nonlinear  criterion  is  theoretically  decoupled  into  a  purely 
linear  problem  for  estimating  the  optimal  numerator  and  a  nonlinear  problem  for  the  optimal  denominator.  The 
proposed  decoupled  approach  has  reduced  computational  requirements  when  compared  to  existing  algorithms 
which  estimate  the  parameters  simultaneously.  This  research  has  led  to  one  Journal  paper  [18]  and  a  Conference 
paper  [19].  More  work  is  under  way  to  generalize  the  results  to  2D  and  Multivariable  system  identification. 

Section  -  3.2  :  Identification  of  1-D  Rational  Systems  in  the  Frequency  Domain  :  A  new  Frequency- 
Domain  (FD)  approach  is  presented  for  optimal  estimation  of  rational  transfer  functions  coefficients.  The  proposed 
method  seeks  to  match  any  arbitrarily-shaped  FD  specifications  in  the  Least-Squares  (LS)  sense.  The  desired 
specifications  may  be  arbitrarily  spaced  in  frequency.  The  design  is  performed  directly  in  the  digital  domain  and 
no  analog  to  digital  transformation  is  necessary.  The  proposed  method  makes  use  of  the  inherent  mathematical 
structure  in  this  rational  modeling  problem  to  theoretically  decouple  the  numerator  and  denominator  estimation 
problems  into  two  smaller  dimensional  problems.  The  denominator  criterion  is  nonlinear  but  possesses  a  weighted- 
quadratic  structure  which  is  convenient  for  iterative  optimization.  The  optimal  numerator  is  found  linearly  by 
solving  a  set  of  simultaneous  equations.  The  decoupled  criteria  retain  the  global  optimality  properties.  The 
performance  of  the  algorithm  is  demonstrated  with  some  simulation  examples.  This  research  has  led  to  one 
Journal  paper  [113]  and  a  Conference  paper  [114].  More  work  is  under  way  to  generalize  the  results  to  2D 
and  Multivariable  system  identification.  Further  work  will  also  look  into  the  use  of  DFT  values  for  System 
Identification. 
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Section  -  3.3  :  Identification  of  All- Pole  Rational  Systems  in  Time-Domain  :  An  algorithm  is  proposed 
for  optimal  estimation  of  the  parameters  of  Auto-Regressive  (AR)  or  all-pole  transfer  function  models  from 
prescribed  impulse  response  data.  The  transfer  function  coefficients  are  estimated  by  minimizing  the  fa-norm  of 
the  exact  model  fitting  error.  Existing  methods  either  minimize  equation  errors  or  modify  the  true  non-linear 
fitting  error  criterion.  In  the  proposed  method,  the  multidimensional  nonlinear  error  criterion  has  been  decoupled 
into  a  purely  linear  and  a  nonlinear  subproblem.  Global  optimality  properties  of  the  decoupled  estimators 
have  been  established.  For  data  corrupted  with  Gaussianly  distributed  noise,  the  proposed  method  produces 
Maximum-Likelihood  Estimates  (MLE)  of  the  AR-parameters.  The  inherent  mathematical  structure  in  the  non¬ 
linear  subproblem  is  exploited  in  formulating  an  efficient  iterative  computational  algorithm  for  its  minimization. 
The  proposed  algorithm  provides  an  useful  computational  tool  based  on  appropriate  theoretical  foundation  for 
accurate  modeling  of  all-pole  systems  from  prescribed  impulse  response  data.  The  effectiveness  of  the  algorithm 
has  been  demonstrated  with  several  simulation  examples.  This  research  has  led  to  one  Journal  paper  [Ill]  and  a 
Conference  paper  [110].  More  work  is  under  way  to  generalize  the  results  to  2D-AR  and  Multivariable- AR  system 
identification. 

Section  -  3.4  :  Design  of  Denominator  Separable  2-D  HR  Filters  :  This  work  extends  the  1-D  results 
in  [112]  to  2-D  system  identification.  In  this  part  of  the  report,  the  optimal  design  of  an  important  class  of 
two-dimensional  (2-D)  digital  IIR  filters  from  spatial  impulse  response  data  is  addressed.  The  denominator  of 
the  desired  2-D  filter  is  assumed  to  be  separable  into  two  1-D  factors.  The  filter  coefficients  are  estimated 
by  minimizing  the  ^-norm  of  the  error  between  the  prescribed  and  the  estimated  spatial  domain  responses. 
The  denominator  and  numerator  estimation  problems  are  theoretically  decoupled  into  separate  problems.  The 
decoupled  criteria  have  reduced  dimensionality.  The  denominator  criterion  is  simultaneously  optimized  w.r.t.  the 
coefficients  in  both  dimensions  using  an  iterative  algorithm.  The  numerator  coefficients  are  found  in  a  straight¬ 
forward  manner.  If  the  desired  response  is  known  to  be  symmetric,  the  proposed  algorithm  can  be  constrained 
to  Preliminary  results  has  been  accepted  for  publication  as  a  Journal  paper  [115].  The  results  will  be  further 
extended  for  Input-Output  and  Frequency  Domain  data. 
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CHAPTER  2 


THE  DIGITAL  MICROWAVE  RECEIVER  DESIGN  PROBLEMS 


Introduction 

Digital  processing  of  microwave  signals  in  Electronic  Warfare  (EW)  environment  poses  a  great  challenge  to 
researchers  in  Signal  Processing.  Along  with  the  standard  requirements  of  any  conventional  radar,  EW  receiver 
design  problem  is  complicated  by  the  fact  that  no  knowledge  about  the  input  signal  is  available  to  the  receiver. 
The  nature  of  the  problem  also  requires  that  measurements  and  decisions  be  taken  immediately  or  within  a 
few  seconds  in  an  entirely  passive  mode  of  operation.  All  microwave  receivers  used  in  practice  utilize  analog 
signal  processing  techniques.  The  frequency-band  of  the  EW  signals  are  in  the  GHz  range  and  the  signals  have 
wide  bandwidths  which  necessitate  sampling  and  processing  of  a  massive  amount  of  data  at  or  near  real-time. 
Presently,  no  EW  receiver  processes  microwave  radar  signals  entirely  in  the  digital  domain.  But  it  is  expected 
that  with  the  emergence  of  increasingly  faster  and  inexpensive  digital  computers  and  high-speed  A/D  converters, 
digital  processing  of  microwave  signals  would  most  certainly  be  the  way  of  the  future. 

In  the  past  two  decades,  many  classes  of  radar  and  sonar  receivers  have  been  converted  from  conventional 
analog  technology  to  purely  digital  or  hybrid  systems,  but  EW  receivers  are  yet  to  make  such  a  transition.  The 
primary  technological  factors  that  have  been  holding  back  possible  fabrication  of  any  digital  EW  receiver  are 
probably  twofold.  Firstly,  if  Analog-to-Digital  (A/D)  converters  are  to  be  used  at  the  operating  frequency  range, 
then  the  Nyquist  rate  would  necessitate  sampling  at  the  GHZ  range  and  secondly,  the  digital  hardware  or  firmware 
must  have  the  capacity  to  process  such  high  data  rate  and  produce  effective  results  at  near  real-time. 

Digital  EW  receivers  can  be  expected  to  offer  some  major  advantages  over  their  analog  counterparts.  Foremost 
among  these  is  the  almost  lossless  storage  capability  of  digital  memories  which  can  eliminate  the  dependence  on 
lossy  analog  delay  lines.  Digital  processors  and  memory  chips  are  relatively  inexpensive,  compact  in  size  and  have 
low  weight  and  the  trends  are  towards  even  further  reductions.  Digital  signal  processing  algorithms  and  digital 
computing  technology  have  matured  tremendously  and  offer  a  wide  range  of  capabilities.  Parallel  processing, 
pipelining,  RISC,  VLSI  design,  systolic  architecture,  vectorization  and  array  processing,  fault  tolerant  computing 
and  etc.,  are  only  some  of  the  well-known  aspects  of  digital  computing  that  the  last  few  decades  of  research  have 
produced.  As  our  research  progresses,  we  intend  to  study  if  some  of  these  ideas  can  be  incorporated  in  the  digital 
receiver  in  order  to  improve  the  efficiency  and  accuracy  of  its  performance. 

The  primary  task  of  a  microwave  receiver  is  to  gather  data  for  sorting  of  signals  and  for  identifying  the 
radar-type.  Based  on  these  information  jamming,  weapon  delivery  or  other  options  are  considered.  In  order  to 
perform  these  tasks,  the  receiver  must  analyze  the  received  radar  pulses  and  measure  or  estimate  the  following 
six  parameters  :  Angle-of-Arrival  (AOA),  Radio  Frequency  (RF),  Time  of  Arrival  (TOA),  Pulse  Amplitude  (PA), 
Pulse  Width  (PW)  and  Polarization  (P). 

A  critical  requirement  of  an  EW  receiver  is  the  AOA  measurement  which  is  known  to  be  a  rather  difficult 
multidimensional  nonlinear  optimization  problem,  especially  when  multiple  closely-spaced  threats  are  to  be  re¬ 
solved.  It  is  also  desirable  to  have  high  sensitivity  and  large  dynamic  range  such  that  a  broad  range  of  signals, 
including  weak  ones,  can  be  detected. 

As  part  of  this  project  several  AOA/frequency  estimation  algorithms  has  been  developed  and  studied.  Most 
existing  high-resolution  frequency-estimation  algorithms  rely  on  special-purpose  hardware  or  software,  such  as, 
Eigendecomposition  or  SVD.  In  Section  2.1,  a  DFT-based  Minimum-Norm  method  is  proposed  which  does  not 
require  any  eigendecomposition  but  produces  high-resolution  frequency  estimates.  This  new  algorithm  needs 
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only  to  compute  the  DFT  of  the  Autocorrelation  matrix  to  separate  the  signal  and  noise  subspaces.  Hence  the 
computational  burden  is  much  lower  than  existing  high-resolution  methods.  Therefore,  this  algorithm  appears  to 
be  very  well-suited  for  FW  applications. 

Another  new  class  of  algorithms,  referred  to  as  KiSS/IQML,  have  been  developed  recently  for  obtaining 
the  Maximum  Likelihood  Estimates  (MLE)  of  frequencies  or  AOAs  from  the  roots  of  z-polynomials.  But  the 
estimated  roots  are  not  guaranteed  to  fall  on  the  unit  circle,  as  desired.  Based  on  the  theory  on  leros  of 
polynomials,  a  new  scheme  is  proposed  in  Section  2.2  here  that  will  ensure  unit  circle  roots.  Many  frequency 
estimation  methods  make  use  of  the  property  that  a  sinusoidal  process  can  be  modeled  as  a  limiting  case  of  a 
narrow-band  auto-regressive  (AR)  process.  But  the  performance  of  all  existing  AR  parameter  estimation  methods 
degrade  significantly  when  the  observation  data  is  corrupted  with  noise.  A  pre-filtering  approach  is  presented 
in  Section  2.3  that  can  improve  AR-parameter  estimates  from  noisy  observation  data.  In  Section  Another  data- 
adaptive  approach  for  improved  modeling  of  ARMA  processes  from  noisy  observations  is  given  in  Section  2.4. 

Parameter  estimation  schemes  either  follow  or  work  in  parallel  with  a  detection  scheme  ensuring  the  presence 
of  any  threat.  A  combined  detection-estimation  scheme  has  the  potential  to  cut-down  computational  burden  on 
the  signal  processor.  As  a  part  of  this  project,  statistical  theory  on  hypothesis  testing  has  been  utilized  for 
detecting  whether  a  threat  is  present  or  not.  In  Section  2.5,  the  time-domain  detection  problem  has  been 
presented  for  single  and  multiple  samples.  Specifically,  the  detection  thresholds  and  Probability  of  Detection 
based  on  Neyman-Pearson  Criterion  have  been  derived. 
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Section  -  2.1  :  MINIMUM-NORM  METHOD  WITHOUT  ElGENDECOMPOSITION  (D-MNM) 


SUMMARY 

In  many  important  practical  applications,  such  as  radar,  sonar  and  astronomy  etc.,  the  resolution  capability  of 
FFT  is  inadequate.  Overcoming  the  resolution  limitation  of  DFT  has  been  a  vigorously  researched  topic  in  Signal 
Processing  in  the  past  three  decades.  The  modern  methods  attain  the  desired  ‘High- Resolution’  or  ‘Superresolu¬ 
tion’  at  the  cost  of  steep  computational  burden.  The  existing  well-known  methods  utilize  Eigen- Decomposition 
(ED),  Singular  Value  Decomposition  (SVD)  or  Maximum  Likelihood  (ML)  computation  or  nonlinear  optimization. 
These  algorithms  can  only  be  implemented  iteratively  which  limits  their  real-time  capabilities. 

The  primary  objective  of  this  part  of  the  project  is  to  study  whether  the  computational  simplicity  of  DFT  can 
be  effectively  combined  with  the  underlying  mathematical  framework  of  high-resolution  methods.  The  desired 
goal  is  to  achieve  high-resolution  without  any  iterative  optimization.  Specifically,  some  well-known  existing 
approaches,  such  as  the  Minimum-Norm  method  (MNM),  extract  the  signal  and  noise  subspace  information 
from  the  eigenvectors  of  the  Autocorrelation  (AC)  matrices.  It  is  shown  in  this  Section  that  the  DFT  of  the 
AC-matrix  (DFT-of-AC)  essentially  performs  an  equivalent  task  of  extracting  and  decoupling  the  signal  and 
noise  subspace  information.  Hence,  it  is  proposed  that  the  signal  eigenvectors  be  replaced  by  the  largest-norm 
DFT-of-AC  vectors.  It  is  demonstrated  that  when  the  DFT-of-AC  vectors  with  larger  norms  are  used  in  the 
MNM  framework,  mostly  better  or  almost  equivalent  high- resolution  DOA  estimates  are  produced.  The  bias, 
mean-squared  error  and  the  root  locations  of  the  proposed  DFT-based-MNM  (D-MNM)  compare  well  with  the 
Eigendecomposition-based  MNM  (E-MNM).  The  simulations  further  show  that  the  performance  of  the  D-MNM  is 
more  robust  at  low  SNR  and  it  has  superior  dynamic  range.  The  major  significance  of  the  proposed  work  is  that, 
no  complicated  iterative  optimization  is  needed  and  the  signal-subspace  information  is  extracted  only  by  a  single 
matrix  multiplication.  Hence,  hardware  implementation  of  D-MNM  for  real-time  high-resolution  DOA/Frequency 
estimation  may  be  feasible  with  currently  available  technology. 

I.  Problem  Definition 

This  part  of  the  report  addresses  the  problem  of  estimating  of  the  Directions  of  Arrival  (DOA)  of  densely 
spaced  narrowband  targets.  Suppose  that  p  plane  waves  originating  from  far-field  point  sources  at  distinct 
directions  impinge  on  a  linear  array  of  N  equally  spaced  sensors.  The  signal  sampled  simultaneously  at  mth 
instant  of  time  at  N  equally  spaced  sensors  form  a  ‘snapshot’  vector  defined  as, 

xm  A  [xm(0)  xm(l)  ...  xm(N  —  l)]f.  (/.I) 

In  the  presence  of  noise,  the  observation  samples  can  be  written  as, 

*m(n)  =  im(n)  -+  zm(n)  (1.2) 

where,  zm(n)  represents  the  additive  observation  noise  and/or  the  modeling  error  and  zm(n)  denotes  the  signal 
part  of  the  observation,  which  is  given  by 

*m(n)  =  '52Am(i)ejI*i{n  -  „  _  0,1 . N-  1  (1.3) 

i=l 

where, 

p  Number  of  narrowband  sources  present 


d  Spacing  between  sensor  elements 

\  Wavelength  of  radiation  of  the  received  signals 

0,  Direction-of- Arrival  (DOA)  of  the  i,A  source 

Am(i)  :  Amplitude  of  the  i,k  source  at  the  m,h  snapshot 
4>m(i)  '■  Phase  angle  of  the  itk  source  at  the  mtk  snapshot, 

Uniformly  distributed  between  —it  and  r. 

The  noise  zm(n)  is  assumed  to  be  zero-mean  and  uncorrelated  with  the  source  signals  and  it  has  a  variance  of  a]. 
The  signal  model  can  be  written  in  a  more  succinct  form  as, 

im(n)  =  (7.4) 

«=i 


where,  u>i  and  Aim  are  defined  as 


Ui  A  —  sin  Oi  and 
Aim  A  Am(i)e-j3^(^,mt‘  + 


(1.5) 

(1.6) 


Further  details  about  the  above  model  may  be  found  in  [11]. 
observation  matrix  can  be  written  as, 

X  A  TA 


With  the  above  formulation  the  model  for  the 


(/ 7) 


where, 


T  A 

1  1  ...  1 

ejw*  . . .  ejwr 

gjwiC/V-l)  cjtai»(JV- 1)  ejw,(N- 1) 

A  [ti  tj 

■  *p]> 

A  A  [®i  aj 

. . .  a*f  ]  and 

t 

II 0  II 

Aim 

Ajm 

■  Apm  - 

for  m  =  1,2, . . M. 

(/.8) 

(1.9) 

(I- 10) 

U- 11) 


For  half  wavelength  spacing  between  two  successive  sensors  of  the  line  array,  w,  =  it  sin  0, .  With  M  snapshot 
vectors  defined  in  (7.2),  the  N  x  M  observation  matrix  X  is  formed  as, 


X  A  [*!  x2  ...  xw],  (7.12) 

Using  the  observation  matrix,  the  spatial  covariance  matrix  can  be  estimated  as, 

C  A  i(XX")  (7.13a) 

x"*x"  (/136> 

m=l 


li 


The  description  of  the  observation  and  the  model  is  now  complete.  Given  the  noisy  observation  matrix  X,  the 
problem  under  consideration  in  this  part  of  the  report  is  to  estimate  Wj’s  and  Ajm’s.  Note  that  the  complex 
amplitudes  can  be  estimated  linearly  once  the  wj’s  are  known  but  the  estimation  of  poses  the  greatest  difficulty 
because  it  is  a  highly  nonlinear  optimization  problem. 


II :  Historical  Perspective 

It  is  apparent  from  the  problem  statement  that  the  DOA  (0,-)  estimation  problem  is  mathematically  equivalent  to 
the  Frequency  Estimation  (w;)  problem  which  has  been  a  major  research  topic  in  many  areas  of  science.  Indeed, 
in  the  last  couple  of  hundred  years,  the  search  for  ‘hidden  periodicities’  from  observed  data  has  appeared  in  varied 
forms  in  several  seemingly  differing  disciplines  of  science. 

To  appreciate  the  sustained  appeal  of  this  problem  to  researchers  over  the  past  two  centuries,  consider  that  as 
far  back  as  in  1795,  Prony  proposed  a  simple  procedure  to  estimate  the  parameters  of  a  multiple  Binusoids  model 
of  an  observation  record  [14,  39].  But  even  in  modern  signal  processing  literature,  useful  modifications  of  Prony ’s 
work  for  noisy  data  are  being  reported  [25,  57}.  About  hundred  years  following  Prony’s  work,  Schuster  had 
introduced  the  idea  of  periodogram  in  1898,  while  determining  the  periodicities  of  meteorological  phenomenon 
[46].  In  Schuster’s  time,  the  calculation  of  the  periodogram  was  computationally  a  very  expensive  procedure. 
But  with  the  advent  of  digital  computers  and  after  the  discovery  of  the  Fast  Fourier  Transform  (FFT)  algorithm 
by  Cooley  and  Tukey  [10],  the  periodogram  has  become  the  standard  choice  for  frequency/DOA  estimation  in 
a  variety  of  important  applications.  The  multiple  sinusoids  model  has  also  been  used  in  radio  astronomy  for 
analyzing  data  received  at  a  finite  aperture  telescope  to  resolve  the  locations  of  closely  spaced  stars  [4].  It  also 
has  wide  applications  in  geophysics,  radar,  sonar  and  biological  signal  analysis  and  ideas  emerging  from  a  large 
variety  of  fields  have  provided  a  certain  maturity  to  this  fundamental  problem. 

II.a  :  The  Resolution  Limitation  of  the  Periodogram 

To  date,  the  periodogram  continues  to  be  the  most  frequently  used  method  for  frequency/DOA  estimation 
[35,  41].  In  fact,  it  is  well  known  that  for  localizing  a  single  target,  if  the  noise  in  the  observed  data  is  Gaussianly 
distributed,  the  periodogram  [41]  produces  the  maximum  likelihood  estimate.  But  in  case  of  multiple  targets,  the 
periodogram  cannot  resolve  two  frequencies  which  are  separated  by  less  than  the  bin-width  of  the  FFT.  In  fact, 
when  the  sources  are  spaced  at  less  than  the  DFT  bin-width,  the  periodogram  fails  to  distinguish  two  closely 
spaced  frequencies  and  only  provides  a  single  frequency  estimate  instead  of  two.  The  last  statement  truly  portrays 
the  problem  one  faces  while  resolving  two  closely  spaced  sinusoids  when  a  relatively  short  data  record  is  available. 
Clearly,  if  any  amount  of  data  is  available  for  processing,  the  periodogram  of  sufficiently  zero-padded  data  will 
provide  reasonably  good  estimates.  But  in  many  problems  of  practical  interest  only  short  data  record  is  available 
and  one  has  to  overcome  the  periodogram ’s  resolution  limitation  by  resorting  to  what  are  commonly  known  in 
the  signal  processing  literature  as  ‘High-Resolution’  or  ‘Superresolution’  techniques.  The  major  contributions  in 
the  higher  resolution  approaches  are  highlighted  next. 

ILb  :  High-Resolution  Methods 

A  multitude  of  DOA/Frequency  Estimation  algorithms,  their  variations  and  analysis  are  available  in  the 
literature  [1-3,  5,  6,  9,  12,  13,  15-67].  In  the  following  paragraphs  only  some  of  the  major  developments  are  briefly 
discussed. 

Minimum  Variance  Method  :  In  order  to  improve  upon  Periodogram’s  resolution  limit,  Capon  had  proposed  this 
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linear  estimator  which  minimizes  the  interference  at  frequencies  outside  the  band  of  interest  [9].  its  performance 
has  been  shown  to  be  better  than  the  periodogram  estimator  but  worse  than  the  modeling  based  estimators  [31]. 

Model-Based  Methods  :  A  major  motivation  for  many  modern  high-resolution  frequency  estimation  methods 
has  come  from  the  desire  to  achieve  more  exact  models  for  the  sinusoids-in-noise  data.  In  the  Parameter  Es¬ 
timation  area  in  the  theory  of  Statistics,  it  had  been  well  established  that  Auto-Regressive  (AR)  modeling  is 
very  appropriate  for  modeling  data  with  peaky  spectra.  But  in  the  frequency  estimation  field  also,  it  had  been  a 
common  knowledge  that  data  composed  of  sinusoids  in  noise  tend  to  have  peaky  spectra.  Consequently,  frequency 
estimation  based  on  AR-modeling  has  received  considerable  attention  [7,  8,  16,  21,  32,  33,  39,  43,  60,  61]. 

Depending  on  how  the  autocorrelation  values  are  estimated,  there  are  three  types  of  AR  parameter  estimation 
methods,  namely,  Autocorrelation  method  [32],  Covariance  method  [32],  and  Modified  Covariance  method  (also 
known  as  the  Forward-Backward  method)  [33,  61].  The  later  two  cases  are  more  appropriate  for  sinusoidal 
processes  because  of  their  implicit  relationship  with  Prony’s  method  which  provides  perfect  frequency  estimates 
when  no  noise  is  present.  Incidentally,  the  Maximum  Entropy  method  proposed  by  Burg  [7,  8]  and  the  Linear 
Prediction  based  spectral  estimator  [32],  both  produce  essentially  identical  frequency  estimates  as  the  Covariance 
method. 

When  p  sinusoids  are  present  and  a  pth  order  AR  model  is  used,  the  frequency  estimates  are  found  to  be 
poor  at  low  SNR  (<  30 dB).  To  circumvent  this  hurdle,  larger  order  ( L  >  p)  AR  model  has  been  proposed  [25, 
60].  The  larger  model  order  tends  to  accommodate  a  major  part  of  the  interfering  noise  and  thereby  reduces  the 
effect  of  noise  in  the  estimates.  The  larger-order  approach  performs  poorly  below  20dB  SNR  [25]. 

Eigen- Analysis  of  the  Auto-Correlation  Matrix  of  Sinusoid-in-Noise  Data :  Since  the  mid-to-late  seventies,  a  whole 
new  class  of  algorithms  are  being  developed  by  effective  exploitation  of  the  special  properties  of  the  autocorrelation 
matrix  of  the  sinusoids-in-noise  data.  For  N  =  p  +  1,  the  eigendecomposition  of  C  was  first  utilized  by  Pisarenko 
[37]  who  showed  that  the  z-polynomial  formed  with  elements  of  the  eigenvector  corresponding  to  the  smallest 
eigenvalue  has  roots  at  the  signal  frequencies.  Though  the  idea  is  elegant,  Pisarenko’s  method  performs  quite 
poorly  for  noisy  signals.  Pisarenko’s  approach  was  later  improved  upon  by  Kumaresan  [25]  where,  for  N  >  p 
cases,  all  the  noise  eigenvectors  had  been  utilized.  As  an  alternate  approach,  it  was  shown  in  [25,  26]  that  the 
signal  subspace  eigenvectors  can  also  be  utilized  to  form  a  noise  subspace  vector  which  should  have  zeros  at  the 
signal  frequency  locations.  This  was  achieved  in  [25,  26,  40]  by  formulating  a  Minimum-Norm  criterion  which  is 
the  framework  that  will  be  used  in  the  proposed  work. 

Another  major  improvement  on  Pisarenko’s  approach  was  presented  by  Schmidt  [44,  45]  and  Bienvenue  and 
Kopp  [1,  2].  They  proposed  to  combine  the  eigenvectors  corresponding  to  the  ( L  —  p)  smaller  eigenvalues  of  C 
and  used  an  orthogonality  criterion  to  obtain  the  frequency  estimates.  In  the  literature,  this  approach  is  known 
as  the  ‘MUSIC’  method. 

It  may  be  pertinent  to  emphasize  here  that  the  approach  proposed  in  this  work  for  extracting  signal  or  noise 
subspace  ‘without  eigendecomposition’  may  be  combined  with  either  the  MNM  or  the  MUSIC  framework.  The 
MNM  framework  has  been  preferred  in  the  development  in  Subsection  IV  because  in  case  of  the  Minimum-Norm 
method,  the  frequencies  are  found  directly  from  the  polynomial  roots.  On  the  other  hand,  a  search  procedure 
is  necessary  in  case  of  MUSIC  for  estimating  the  frequencies.  The  polynomial  version  of  MUSIC,  known  as 
‘root-MUSIC’,  could  also  be  used  but  in  that  case  the  order  of  the  z-polynomial  would  be  twice  that  of  MNM. 

Maximum-Likelihood  Method  :  This  class  of  algorithms  maximize  the  likelihood  function  for  the  observed  data, 
leading  to  optimization  of  a  non-linear  criterion  which  can  only  be  performed  iteratively.  Several  different  ap¬ 
proaches  are  available  in  the  literature  [5,  27-29,  41,  42,  48-50,  52,  66]  among  which  the  recently  proposed 
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Constrained  MLE  approach  developed  [52]  by  the  present  author  appears  to  offer  the  most  accurate  results. 

Other  Methods  and  the  Motivation  for  the  Proposed  Method  :  As  listed  in  the  references,  there  are  a  large  number 
other  methods  that  address  the  high-resolution  Frequency /DOA  estimation  problems.  In  order  to  achieve  the  de¬ 
sired  high-resolution  capability,  all  these  algorithms  utilize  some  form  of  eigenanalysis  or  non-linear  optimization, 
both  of  which  are  computationally  intensive  for  real-time  applications.  The  primary  objective  of  this  project  is 
to  study  whether  the  computational  simplicity  of  DFT  can  be  effectively  combined  with  the  underlying  mathe¬ 
matical  framework  of  some  of  the  existing  high-resolution  methods.  The  final  goal  is  to  achieve  high-resolution 
without  any  iterative  optimization  such  that  real-time  implementation  may  be  feasible  with  existing  hardware. 
The  proposed  method  makes  use  of  the  special  properties  of  correlation  matrices  which  are  outlined  next. 


m  :  Some  Properties  of  the  Autocorrelation  Matrix 


Since  the  data  described  by  (7.3)  is  uncorrelated,  zero  mean  WSS  process,  the  N  x  N  (N  >  p)  covariance 
matrix  C  will  have  the  following  matrix  decomposition  when  there  is  no  observation  noise, 


C  =  TST' 


(7/7.1) 


where,  £  A  diag  (<r\  o\  ...  <r£)  and  <r?  denotes  the  power  of  the  the  s'-th  signal.  Note  that  this  ideal  C  has 
rank  p.  In  this  case,  the  eigen-decomposition  of  C  can  be  written  as, 

IV  (777.2a) 
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(777.26) 


For  observations  with  noise  as  defined  in  (7.3), 

C  =  TST"  +  <r)l. 


(777.3) 


Note  that  this  theoretical  C  has  rank  N  though  the  signal  part,  TSTH  has  rank  p.  In  this  case,  the  eigen- 
decomposition  of  C  can  be  written  as, 


CV  =  [ (A!  +  <r*)vx  •  •  •  (Ap  +  <rf)vp  <r?vp+i  ■  ■  ■  (r]vN  ]  (777.4) 

where,  the  Aj’s  and  <r\  represent  the  signal  and  noise  eigenvalues.  But  in  practice,  the  eigendecomposition  has  to 
be  performed  on  the  sample  covariance  matrix  C  as  defined  in  (7.13)  and  then  the  noise  eigenvalues  will  not  be 
equal  but  will  be  absorbed  with  the  signal  eigenvalues  also.  In  that  case, 

CV  =  [AiVi  •••  ApVp  Ap+1Vp+i  •••  Ajwat]  (777.5) 

where,  the  estimated  eigenvalues  are  ordered  as,  Ai  >  Aj  >  •••  Ajy.  The  eigenvectors  corresponding  to  the 
p  largest  eigenvalues  are  called  the  ‘signal  eigenvectors’  which  constitute  the  ‘signal-subspace’.  All  the  other 
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(N  -  p)  eigenvectors  are  known  as  the  ‘noise  eigenvectors’.  Note  also  that  the  p  ‘signal  eigenvectors’  of  C  span 
the  subspace  defined  by  the  columns  of  T  and  that  they  are  orthogonal  to  the  ‘noise  subspace’  eigenvectors. 

IV  :  The  Proposed  DFT-Based  Minimum-Norm  Method  (D-MNM) 


As  a  significant  departure  from  the  eigen-based  approaches  discussed  in  the  previous  section,  this  work  advocates 
that  the  signal-subspace  information  be  extracted  from  the  DFT-of-AC  matrix  which  can  be  accomplished  with 
a  single  matrix  multiplication.  This  will  eliminate  the  need  for  iterative  calculation  of  eigenvectors  which  is 
computationally  intensive.  The  central  idea  behind  the  DFT-of-AC  matrix  is  analyzed  first. 

IV. a  :  Signal  and  Noise  Subspace  Extraction  from  the  DFT-of-AC  Matrix 


Let  the  DFT  matrix  be  denoted  as, 


D  A  [ei  e2 


e/v]. 


(IV.  1) 


where,  the  elements  of  the  Jb-th  DFT-vector  e*  is  defined  as,  e*(l)  =  e^kl,  for  k,l  =  0,1,  2,  . . .,  N  —  1.  If  the 
frequencies  are  all  on  the  DFT  bins  and  if  there  is  no  observation  noise,  then  in  general, 


ft  A  Ce* 
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(IV.2a) 

(IV.2b) 

(IV.2c) 

(IV.2d) 


If  the  t-th  DFT  vector  e*  corresponds  to  one  of  the  w,  frequencies, 

.  M  .  M 

ft  =  T7  £  ^ImTam  =  T-  £  Alm&m 


m=l 


m=l 


JH  Em=l 

’d'ti' 

=  T 

ST  E^=l  l^tml2 

=  T 

*1 

■  Jf  22m= l  ^tm-^P"*  - 

■  d-tp. 

(/V.3) 


where,  &ki 8  denote  the  covariance  of  the  complex  amplitudes.  Assuming  the  number  of  samples  M  to  be  large 
and  since  Akm s  are  independent  random  variables,  A  &u  =  Hence, 


ft  -»  djfct*  =  diet. 


(IV.  4) 


Note  that  the  norm  of  ft  is  directly  proportional  to  the  signal  power,  i.e.,  this  norm  will  be  large  if  the  signal 
power  is  significant.  On  the  other  hand,  if  a  DFT-vector  et  does  not  correspond  to  any  of  the  frequencies  then 
due  to  orthogonality,  t^ei  =  0,  Vi.  For  such  cases, 


ft  =  0. 


(IV.  5) 
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For  this  ideal  case  then,  the  DFT-of-AC  has  the  following  decomposition, 

F  A  CD  ( IV.fa ) 

A  [fj  f2  •••  f*]  (IV.6b) 

—  [Am!  ■■■  A,uP  0  •••  0]  (IV.6c) 

where,  the  A,s  and  u<s  are  the  lengths  and  unit  vectors  of  each  fi,  respectively.  Note  that  the  unit  vectors  in  the 
matrix  in  ( IV.6c )  have  been  rearranged  so  that  the  zero/nonzero  components  are  clustered  t  *.  Interestingly, 

this  decomposition  appears  to  be  very  similar  to  the  usual  Eigendecomposition  of  noiseless  ;eal  C,  as  given 


by  (III. 2).  For  this  ideal  signal  scenario  again,  if  the  DFT-of-AC  is  formed  using  the  theoretical  and  noisy 
Covariance  matrix  of  (///.3),  then  the  decomposition  has  the  form, 


F  =  CD  (IV. la) 

=  T£T"D  +  a]  D  (IV.lb) 

— •  [(Ai+<r?)ui  •••  (Ap  +  <rJ)up  ff^Up+i  •••  o-Jujv],  (IV.lc) 

where  the  Uj’s  have  been  arranged  in  decreasing  order  of  lengths.  Note  again  that  this  decomposition  is  analogous 
to  the  one  in  (III A).  In  this  case  also,  the  p  largest-norm  vectors  of  the  DFT-of-AC  matrix  contain  the  signal 
subspace  information. 

In  practice,  the  u/js  will  not  be  on  the  DFT  bins  and  the  observations  may  also  be  noisy  and  hence,  the 
decomposition  in  (IV.6)  or  (IV.l)  will  not  hold.  But  the  DFT-components  (f*s)  closer  to  the  signal  frequencies 
will  tend  to  have  larger  norms  (this  is  further  analyzed  in  Subsection  VI).  Hence,  for  the  general  scenario,  when 
the  observation  data  is  noisy  and  the  angular  frequencies  w,s  are  arbitrarily  spaced,  the  signal/noise  subspace 
decomposition  can  be  formed  as  : 


F—  [Ami 
A  A  [Us 


ApUp  |  Ap+iUp+i 
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(/V.8a) 

(IVM) 


where,  Ai  >  Aj  >  •  •  •  >  Ajv  are  the  norms  of  the  f<  vectors  and  the  matrices  A,  Us  and  Uyv  are  formed  as, 
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(IV.  8c) 


It  may  be  observed  again  that  the  decomposition  in  (IV. 8)  is  analogous  to  the  eigen-based  counterpart  in  (///. 5). 
It  may  be  noted  here  that  in  case  of  the  ideal  signal  cases  of  (IV.6)  and  (IV.l),  an  unit  vector  m  corresponds  to 
one  of  the  DFT-vector  e*,  but  in  the  general  case  of  (IV A),  they  are  linear  combinations  of  the  DFT-components 
close  to  the  signal  frequencies  (the  general  case  is  further  analyzed  in  5:  section  VI). 

IV. b  :  Incorporation  of  DFT-Based  Signal  Subspace  in  Minimum-Norm  Framework 

The  principal  idea  behind  the  Minimum-Norm  method  is  to  form  an  appropriate  ‘noise-subspace’  vector  d 
which  is  orthogonal  to  the  ‘signal-subspace’  defined  by  Us.  Let, 


N-l 


D(z)  A  £  dkz~k 


(IV.9) 


k=  0 


16 


be  an  (N  —  l)-th  order  r-polynomial  with  p  zeros  at,  z*  =  eiu>k ,  for,  k  =  1,  . . .  ,p,  corresponding  to  the  DOAs. 
The  coefficient  vector  is  denoted  as, 

dA  [do  dx  ■■■  ds-i  ]T ,  (/V10) 

where,  do  =  1-  According  to  the  MNM  philosophy  [26],  if  Us  does  constitute  of  the  signal-subspace,  then  d 
must  be  orthogonal  to  Us,  ».«•, 

U5  d  =  0.  {IV .11) 

d  needs  to  be  found  by  solving  this  underdetermined  set  of  equations  which  has  infinite  number  of  solutions. 
According  to  [25,  26],  the  solution  that  also  minimizes  the  norm  ||d||2,  possesses  the  desirable  property  that  all 
its  roots  fall  inside  the  unit  circle.  This  ‘minimum-norm’  solution  of  d  for  solving  (/VM1)  can  be  expressed  as  : 


d  = 


1 

-  G"(GG")-1g 


(/V.12a) 


where,  U5  is  partitioned  as, 


U?  A  [g  |  G]. 


(/K126) 


Once  d  is  estimated,  the  p  roots  of  D(z)  closest  to  the  unit  circle  are  used  to  find  the  DOAs.  It  may  be  recalled 
that  in  E-MNM  the  signal-subspace  eigenvectors  vj,  V2,  ...,  vp,  as  defined  in  (///. 5)  are  used  to  form  U5  [25, 
26,  40].  But  in  case  of  the  proposed  approach,  no  eigendecomposition  is  necessary.  Post-multiplication  of  C  by 
the  DFT-matrix  D  is  all  that  is  required  to  extract  the  signal  subspace  in  {IV. 8). 


IV. c  :  Summary  of  the  Proposed  D-MNM  Algorithm 


The  key  steps  and  some  alternative  possibilities  are  summarized  in  this  Subsection. 


IV.c.l  :  Algorithm  Steps 


1.  Form  the  Covariance  Matrix  estimate  using  forward-backward  method  [25,  26]  : 


(7V.13) 


The  ‘backward’  vector  is  defined  as  x*,  A  Jx^,,  where,  J  denotes  the  permutation  matrix  with  l’s  at  the 
cross-diagonal  entries  and  *  denotes  the  complex-conjugate  operation. 


2.  Post-multiply  C  by  the  DFT  matrix  D  to  form  the  DFT-OF-AC  matrix,  F  A  CD. 

3.  Form  U  as  in  (IV.8c)  using  the  p  unit  vectors  corresponding  to  the  largest  norms.  Partition  Us  as  in 
(TV.  126). 


4.  Estimate  the  d  vector  using  {IV.\2a)  and  form  the  D(z)  polynomial  using  the  elements  of  d. 


5.  Find  the  roots  of  D{z).  Pick  the  p  roots  closest  to  the  unit  circle  to  find  the  desired  frequencies/DOAs. 


IV.C.2  :  Alternate  Possibilities 


Steps  2  and  3  :  Post-multiplication  of  the  AC-matrix  by  a  DFT-matrix  has  been  used  here  because  the  decompo¬ 
sitions  as  described  in  Subsection  III  appear  analogous  to  eigendecomposition.  But  it  is  easy  show  that  identical 
results  can  be  obtained  if  the  AC-matrix  is  pre-multiplied  by  a  DFT  matrix,  t.e.,  the  DFT-of-AC  matrix  can  also 
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be  formed  alternately  aa,  F i  A  DC.  In  that  case,  the  largest  norm  row  vectors  of  the  DFT-of-AC  matrix  Fi 
must  be  used  to  form  Ug  defined  in  ( / V.  13). 

Step  4  ■  This  step  requires  inversion  of  a  matrix  of  dimension  (N  -  1)  x  (N  -  1).  This  can  be  avoided  by  orthog- 
onalizing  the  p  largest  norm  vectors  in  Us.  Let,  U|  be  the  new  ‘signal-subspace’  matrix  with  the  orthonormal 
set  of  vectors  which  can  be  written  in  partitioned  form  as, 


U°s"  A  [g0  |  Go]. 


With  these  partitioned  matrices,  d  can  again  be  found  in  Step-4  as  [25,  26], 


(JV.14) 


L  “  G"g«/(1  -  g"go)J 


(IVA5) 


It  may  be  mentioned  here  that  in  [25,  26],  p  orthonormal  signal  eigenvectors  were  used  to  form  U5,  whereas  here 
U|  is  formed  by  orthogonalizing  the  p  largest  norm  vectors  of  the  DFT-of-AC  matrix. 

Step  5  :  This  step  requires  rooting  of  the  ( N  —  l)-th  order  polynomial  D(z).  Instead,  the  frequencies  may  also 
be  found  from  the  peaks  of  the  following  minimum-norm  pseudo-spectrum  [25,  26,  55]  : 


PmnhW")  A 


(/V.  16) 


V :  Simulation  Results 

In  this  Subsection  the  performance  of  the  proposed  algorithm  is  compared  with  some  of  the  existing  well- 
known  algorithms  using  commonly  used  data  sets.  For  the  purpose  of  simulations,  DOA  and  Frequency  Estimation 
problems  are  treated  separately. 

V.a  :  DOA  Estimation 

Simulation  1  :  Two  Densely-Spaced  Targets  of  Equal  Powers  [53,  54] 

Planewaves  from  p  =  2  sources  with  8\  =  18°  and  82  =  22®  incident  on  N=8  sensors  were  modeled  as  in  [26, 
27,  29].  The  number  of  snapshots,  M=10.  Fig.  I  shows  the  norms  of  the  fi  vectors  for  20  trials  at  20dB  SNR. 
The  two  largest  A,s  always  appear  to  be  more  significant  than  all  the  smaller  ones.  Figures  2a  and  2b  show  the 
roots  of  D(z)  for  50  independent  realizations  using  D-MNM  and  E-MNM,  respectively.  The  figures  show  that 
the  roots  in  both  cases  are  at  almost  same  locations.  Table-1  compares  E-MNM  and  D-MNM  in  terms  of  the 
bias  and  RMS  values  with  200  independent  trials  at  different  SNR  values.  The  results  clearly  indicate  that  the 
performance  of  D-MNM  is  quite  close  to  that  of  E-MNM,  though  no  Eigendecomposition  was  required  in  this 
case.  In  fact,  D-MNM  was  found  to  be  somewhat  more  robust  (in  terms  of  successful  trials)  at  low  SNR  ranges. 

Simulation  2  :  Comparison  of  Dynamic  Range  with  Two  Targets  of  Unequal  Powers  [53,  54] 

For  this  example,  two  well-separated  sources  are  located  at  halfway  between  corresponding  DFT-bins  with 
81  —  7.1808®  and  81  =  61.045®.  The  SNR  for  8\  is  maintained  at  a  fixed  value  of  30dB  whereas  the  SNR 
for  the  signal  at  82  is  gradually  reduced  from  30dB  to  OdB  in  steps  of  2dB.  The  mean  values  with  50  independent 
trials  at  each  SNR  for  D-MNM  and  E-MNM  are  displayed  in  Fig.  3.  Clearly,  D-MNM  demonstrates  superior 
dynamic  range  than  E-MNM. 
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V.b  :  Frequency  Estimation 

In  this  Subsection,  the  proposed  algorithm  is  compared  with  the  well-known  Tufts-Kumaresan  (TK)  method  [25, 
57]  and  MUSIC  method  [1,  2,  44,  45]  via  simulations. 

Simulation  3  :  Comparison  of  High- Resolution  Performance  and  Threshold  Enhancement 

The  simulation  data  is  generated  using  the  formula  [25,  57]  : 

y(n)  =  5>»+>i  +  aaei>'(°  «>»  +  u;(n),  for,  n  =  0,  1 . M  -  1  (VI) 

where,  w(n)  is  complex  white  Gaussian  noise  with  variance  <r* .  The  number  of  data  samples  used  is,  M=25. 
This  data  set  has  been  widely  used  in  the  literature  for  studying  the  performance  of  various  methods.  For  this 
data  set,  it  has  been  shown  in  [25,  57]  that  the  TK  method  performs  best  when  high-order  (L  x  L)  covariance 
matrix  with  L  =  18  is  used  with  forward-backward  covariance  matrix  [17,  55].  Five  hundred  independent  noise 
realizations  were  used  to  compare  the  performance  of  the  proposed  method  with  that  of  TK  method  and  MUSIC. 
The  mean  values  for  three  cases  at  different  SNR  values  are  displayed  in  Fig.  4.  The  RMSE  results  are  shown 
in  Fig.  5  along  with  CR  Bound  for  the  frequency  at  fi  =  0.52 Hz.  The  bias  and  RMSE  at  different  SNR  values 
are  also  tabulated  in  Table  2.  Clearly,  the  proposed  method  extends  the  performance  threshold  closer  to  the  CR 
bound.  Hence  the  performance  of  the  proposed  method  approaches  that  of  the  Maximum-Likelihood  method 
more  closely. 

VI  :  ANALYSIS,  DISCUSSION  AND  DIRECTIONS  ON  FURTHER  RESEARCH 

The  results  presented  so  far  are  quite  intriguing  and  can  be  expected  to  have  far-reaching  consequences  on 
simplifying  the  present  practice  of  frequency/DOA  estimation.  The  proposed  approach  of  forming  signal-subspace 
using  DFT  without  any  eigendecomposition  also  opens  up  whole  new  avenues  for  further  research  and,  at  the 
same  time,  poses  some  unanswered  questions.  Furthermore,  it  may  be  possible  to  extend  and  incorporate  similar 
ideas  in  other  closely  related  problems  or  to  develop  more  simplified  algorithms.  The  theoretical  performance 
of  the  method  needs  to  be  thoroughly  analyzed.  The  major  advantage  of  the  proposed  approach  is  that  all  the 
signal-subspaces  are  obtained  with  a  single  matrix  multiplication.  This  step  may  be  performed  using  FFT  which 
is  very  efficient  for  hardware  and  software  implementation.  Preliminary  analysis  of  the  proposed  work  and  some 
directions  for  further  research  are  briefly  outlined  in  this  section. 

1.  Reduced  Computational  Complexity  and  Usefulness  in  High  Sampling- Rate  Problems  :  The 
major  significance  of  D-MNM  is  that  its  high-resolution  capability  does  not  rely  on  any  iterative  method 
or  eigendecomposition  which  is  also  computed  iteratively.  The  lower  computational  complexity  of  D-MNM 
should  be  attractive  in  any  general  frequency/DOA  estimation  scenario.  But  the  usefulness  of  the  proposed 
method  should  be  specially  significant  in  those  applications  where  traditional  high-resolution  methods  are  yet 
to  make  much  inroads  due  mainly  to  extremely  high  sampling  rate  requirements.  Specifically,  in  Electronic 
Warfare  (EW)  applications,  the  signals  usually  operate  in  the  GHz  range  but  real-time,  high-resolution 
capability  is  a  necessity  [56].  Currently  no  EW  receiver  processes  signals  entirely  in  digital.  The  proposed 
DFT-based  MNM  with  its  low  computational  complexity,  is  expected  to  provide  the  desired  high-resolution 
capability  to  future  digital  EW  receivers. 

2.  Signal-Subspace  Information  from  the  Autocorrelation  Matrix  Only  :  The  strength  of  the 
Minimum-Norm  framework  as  a  high-resolution  method  really  comes  from  its  ability  to  form  the  ‘noise- 
subspace’  vector  d  by  exploiting  the  orthogonality  property  in  (IV.ll).  It  appears  that  as  long  as  U5  has 
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some  component  of  the  signal-subepace  T,  the  solution  of  (/V'.ll)  would  retain  its  high-resolution  capabil¬ 
ity.  The  DFT-of-AC  is  an  appropriate  candidate  to  produce  Us  because  it  is  a  linear  combination  of  the 
signal-vectors  in  T.  This  can  be  seen  by  rewriting  the  DFT-of-AC  matrix, 


F  =  CD 


i  £  M*£d) 


m=  1 


(VJ.l) 


In  fact,  the  AC  matrix  itself  is  also  a  possible  candidate  for  obtaining  the  ‘signal-subspace’  U5,  because  it 
can  be  expressed  as  a  linear  combination  of  the  signal- vectors  in  T, 
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(VI. 2) 


Not  surprisingly,  when  Us  is  formed  with  the  p  largest  norm  vectors  of  the  estimated  C,  MNM  again 
demonstrated  high-resolution  capability  in  simulations  (not  included).  This  simpler  procedure  to  obtain 
‘signal-subspace’  information  needs  to  be  studied  further.  But  it  must  be  stated  that  D-MNM  performs 
better  at  low  SNR  because  the  DFT  operation  accentuates  the  signal-subspace,  as  discussed  next. 


3.  Analysis  of  the  DFT-based  Signal  ^ubspace  for  Arbitrary  DO  A/ Frequency  :  For  ideal  noise-free 
observations  if  the  frequencies  are  not  on  the  DFT  bins,  the  DFT-of-AC  operation  can  be  expressed  as  : 
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Consider  the  matrix  at  right.  Each  of  the  tf*D  vectors  are  complex  valued  DFT  of  a  sequence  of  a  complex 
sinusoid.  The  magnitude  of  each  row  vector,  tf*D  has  a  Sine  envelope  with  a  peak  occurring  at  the  column 
corresponding  to  the  bin  location  closest  to  the  frequency  u/j.  For  infinite  aperture  with  N  — *  00,  i.e.,  for 
large  number  of  sensors,  each  row  vector  peaks  at  u/,-  and  the  other  elements  of  that  row  approaches  zero. 
The  same  will  be  the  case  for  each  of  the  other  row  vectors  also.  Hence,  asymptotically,  the  DFT-of-AC 
operation  again  produces  p  largest  norm  vectors  at  the  true  signal  frequencies.  The  asymptotic  analysis  for 
the  noisy  case  as  defined  by  (IV  .7)  would  also  provide  similar  results.  For  finite  N,  because  of  the  Sine 
weighting,  the  largest  norm  vectors  will  also  have  contributions  from  some  other  t<  vectors  in  the  T.  But 
those  components  also  contain  signal-subspace  information  which  is  orthogonal  to  d  and  hence  useful  for 
obtaining  the  minimum-norm  vector  d.  Further  analysis  in  this  direction  will  be  performed. 


4.  Performance  and  Accuracy  Analysis  :  From  the  results  presented  in  Subsection  V,  the  DFT-of-AC 
operation  appears  to  retain  significant  signal  information  comparable  to  signal  eigenvectors  produced  by 
eigendecomposition.  This  phenomenon  needs  to  be  quantified  analytically.  A  possibility  would  be  to  analyze 
and  compare  the  respective  Frobenius  norms  of  the  Projections  onto  the  true  signal  basis-space  as  produced 
by  the  signal-subspaces  due  to  the  eigen-based  as  well  as  DFT-based  methods.  Most  of  the  existing  eigen- 
based  methods  have  been  analyzed  to  study  their  performance  and  accuracy  [20,  22,  38,  64,  65].  Following 
this  trend,  we  plan  to  perform  statistical  analysis  of  the  bias,  variance  and  the  resolution  threshold  of  the 
estimates  produced  by  the  present  method. 
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5.  Estimation  of  the  Parameters  of  Damped  Sinusoids  in  Noise :  Many  eigen-based  methods  have  been 
successfully  utilised  in  estimating  the  unknown  parameters  of  damped  sinusoids  from  noisy  observations  [24, 
25].  It  appears  that  with  some  simple  modifications  the  proposed  DFT-based  approach  could  also  be  used 
for  the  same  purpose.  The  advantage  would  again  be  that  no  eigendecomposition  but  the  performance  will 
be  comparable. 

6.  Largest  Norms  vs.  Peaks  :  In  all  the  simulations  presented  here,  the  signal  subspaces  have  been  formed 
by  selecting  the  p  unit-vectors  having  largest  norms.  But  the  ideal  solution  may  be  to  pick  the  unit  vectors 
corresponding  to  the  p  largest  peaks  (having  smaller  norm  vectors  on  both  adjacent  bins).  This  may  eliminate 
any  possibility  of  picking  multiple  vectors  from  the  vicinity  of  strong  signals.  It  should  be  emphasized  though 
that  largest  norm  criteria  has  worked  quite  well  so  far,  as  demonstrated  by  a  large  number  of  simulations. 
But  this  aspect  certainly  needs  further  analysis. 

7.  Zero  padding  :  In  classical  spectral  estimation,  Periodogram  relies  on  DFT/FFT,  but  it  is  often  necessary 
to  extend  (or,  pad)  the  available  data  with  zeros  so  that  interpolated  values  between  available  bins  can  be 
calculated.  Zero-padding  is  also  used  to  extend  data-lengths  to  powers  of  two  such  that  the  computational 
efficiency  of  the  FFT  can  be  taken  advantage  of.  In  the  simulation  studies,  no  zero-padding  had  been 
incorporated  so  far.  It  is  not  quite  apparent  whether  the  zero-padding  should  be  done  directly  to  the  data 
or  to  the  covariance  estimates  and  this  aspect  needs  further  study.  It  would  also  be  necessary  to  study  the 
possible  effects  on  the  signal-subspace  produced  by  the  DFT-of-AC  operation  after  zero-padding  is  introduced. 

8.  Windowing  :  In  classical  spectral  estimation,  in  order  to  avoid  sudden  discontinuities,  the  observed  data  is 
often  weighted  (or  tapered  at  both  ends)  by  non-rectangular  window  which  tends  to  enhance  the  ‘dynamic 
range’  at  the  cost  of  ‘resolution’  [17].  In  the  simulation  results  presented  here,  no  windowing  has  been  used. 
But  windowing  is  known  to  be  highly  effective  in  locating  weak  frequency  components  which  tend  to  get 
submerged  by  the  sidelobes  of  strong  components.  Though  it  is  believed  that  that  orthogonality  property  in 
(/V.ll)  is  the  main  contributing  factor  for  the  high-resolution  capability  of  D-MNM,  it  would  certainly  be 
interesting  to  study  what  effects  windowing  might  have  on  the  performance  of  D-MNM. 

9.  Use  of  DFT-Based  Signal-Subspace  in  other  Eigen-Based  Methods  :  Other  than  the  Minimum- 
Norm  Method,  there  is  a  large  body  of  work  where  some  form  of  eigendecomposition  is  utilized  to  estimate 
DOA/Frequencies  [1-3,  6,  15,  18,-21,  23-26,  30,  34,  36-38,  40,  44,  45,  47,  51,  57,  58,  62-65,  67].  Among  the 
more  important  results  are,  MUSIC  [44,  45],  SVD  [25,  26]  and  ESPRIT  [36].  It  is  quite  possible  that  the 
proposed  DFT-based  signal-subspace  may  be  incorporated  with  some  of  these  existing  eigendecomposition 
based  methods,  in  order  to  implement  those  methods  without  eigendecomposition.  Clearly,  the  proposed 
approach  can  be  used  to  implement  MUSIC,  except  that  the  noise  subspace  U/v  defined  in  (IV.Sc)  would 
have  to  be  utilized.  Also,  the  left  and  right  eigenvectors  of  the  SVD  of  a  data  matrix  are  actually  the 
eigenvectors  of  correlation  matrices.  Hence,  it  appears  that  some  of  the  SVD-based  approaches  may  also  be 
modified  to  incorporate  DFT-based  signal/noise  subspaces.  Care  should  be  taken  about  the  choice  of  either 
the  left  or  right  signal-spaces,  because  both  may  not  contain  signal  information.  The  case  is  not  so  apparent 
for  those  methods  which  use  generalized  eigendecomposition  [36,  51,  63].  Some  of  these  possibilities  will  be 
further  investigated. 

10.  Model  Order  Selection  :  In  its  current  form,  the  proposed  approach  assumes  that  the  number  of  targets 
(p)  is  known.  But  Fig.  1  and  the  analysis  in  Subsection  IV  suggest  that  it  may  be  possible  to  estimate  the 
model  order  from  the  norm  of  the  DFT-of-AC  vectors.  This  possibility  will  be  explored  further. 
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11.  DFT-Prony  :  There  has  been  some  recent  interest  in  implementing  the  Prony’s  algorithm  in  the  Frequency- 
Domain  [43].  Clearly,  the  signal- vectors  in  Us  can  be  treated  as  multiple  time-series  to  form  a  (p+ 1)  x  (p+ 1) 
covariance  matrix  (using  forward-backward  approach)  and  then  the  p-th  order  Prony’s  polynomial  can  be 
estimated.  Based  on  preliminary  simulations  (not  included),  this  approach  appears  to  be  simplest  of  all 
existing  methods  with  moderately  good  high-resolution  performance.  The  performance  of  DFT-Prony  is 
much  better  than  that  of  the  standard  Prony’s  method  because  the  DFT-based  signal  subspace  is  cleaned- up 
though  without  any  eigendecomposition.  These  ideas  needs  to  be  further  studied. 

12.  Two-Dimensional  Frequency- Wavenumber  Estimation :  In  some  array  processing  scenarios,  both  the 
DOAs  (related  to  wavenumbers)  and  the  center  frequencies  need  to  be  estimated.  Many  existing  1-D  eigen- 
based  methods  have  been  extended  to  2-D  to  address  this  problem.  It  appears  that  the  DFT-of-AC  vectors 
can  be  formed  in  both  domains  and  two  D(z)  polynomials  can  be  formed  to  estimate  the  the  frequencies  and 
DOAs  separately.  Incorporation  of  the  DFT-based  signal-spaces  for  2D  frequency  estimation  will  be  further 
investigated  at  later  stage. 

13.  Hardware  Implementation  :  Perhaps  the  most  important  and  useful  practical  impact  of  the  proposed 
method  would  be  in  the  area  of  hardware  implementation  for  high-resolution  Direction-of-Arrival  or  frequency 
estimation.  All  the  currently  available  methods  with  good-enough  high-resolution  capability,  rely  on  some 
form  of  iterative  optimization  or  iterative  computation  of  eigenvectors  [1-3,  5,  6,  15,  18,  19,  21,  23-30,  34,  36, 
37,  40-44,  45,  47-59,  62,  63,  65-67],  In  contrast,  all  that  the  proposed  approach  requires  to  form  the  ‘signal- 
subspace’  is  a  single  matrix  multiplication.  Furthermore,  the  matrix  to  be  multiplied  is  a  DFT  matrix  and 
it  has  special  structures  so  that  FFT  based  processing  may  be  utilized  to  further  reduce  the  computational 
burden.  Hence,  one  of  the  major  goals  of  the  proposed  work  would  be  to  devise  appropriate  strategies  to 
design,  develop  and,  if  possible,  fabricate  VLSI  hardware  for  high-resolution  DOA/Frequency  estimation. 
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Fig.4.  Comparison  of  Mean  values  with  500  independent 
trials  for  three  methods. 


Comparison  of  Performance 


SNR 


Fig.5.  Comparison  of  RMS  values  with  CR  bounds  for  500 
independent  trials. 


Table  2.  Comparison  of  Bias  and  RMS  values  for  three  methods 
with  500  independent  trials. 
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IMPACT  OF  THE  PROPOSED  RESEARCH 


The  proposed  idea  of  extracting  Signal  and/or  Noise  subspace  information  from  the  DFT  of  AC  matrix  by  a 
single  matrix  multiplication  is  expected  to  have  far-reaching  consequences  on  simplifying  the  present  practice  of 
frequency/DOA  estimation.  Some  preliminary  results  on  the  work  has  been  recently  accepted  as  a  brief  Journal 
paper  [54].  The  following  is  a  verbatim  quote  from  one  of  the  anonymous  reviewers, 

“  . . .  contains  what  I  think  is  a  significant  breakthrough  ( comparable  to  spatial  smoothing)  in  this  narrow  field 
of  multiple  source  location.” 

This  work  will  also  be  presented  at  ICASSP-94  [53].  Some  possible  impacts  of  the  proposed  work  are  briefly 
discussed  next. 

1.  Reduced  Computational  Complexity  and  Usefulness  in  High  Sampling-Rate  Problems  :  One 
of  the  major  advantages  of  the  proposed  work  is  that  the  entire  signal-subspace  is  obtained  with  a  single 
matrix  multiplication.  In  many  cases,  this  operation  may  be  performed  using  FFT  which  is  very  efficient  for 
hardware  or  software  implementation.  The  lower  computational  complexity  of  D-MNM  should  be  attractive 
in  any  general  frequency/DOA  estimation  scenario.  But  the  usefulness  of  the  proposed  method  should  be 
specially  significant  in  extremely  high  sampling  rate  problems,  as  for  example,  in  Electronic  Warfare  (EW) 
applications. 

2.  Signal-Subspace  Information  from  the  Autocorrelation  Matrix  Only  :  This  possibility  has  been 
briefly  discussed  in  item-2  of  Subsection  VI.  This  method  appears  to  be  much  simpler  for  implementation 
than  the  DFT-of-AC  case.  If  found  to  be  viable  for  high-resolution  frequency  estimation  at  moderate  to  high 
SNR,  as  the  preliminary  simulations  seem  to  indicate,  this  method  may  offer  further  reduced  computational 
complexity. 

3.  Performance  and  Accuracy  :  From  the  results  presented  in  Subsection  V,  the  DFT-of-AC  operation 
appears  to  produce  similar  or  better  estimates  than  its  eigen-based  counterpart,  especially  at  low  SNR.  This 
trend  appears  to  be  true  for  all  examples  attempted.  In  fact,  Simulation-3  demonstrates  that  the  results  of 
the  proposed  method  are  closer  to  the  CR  bounds  and  hence,  its  performance  is  more  similar  to  the  MLE 
than  the  original  eigen-based  MNM.  This  trend  needs  to  be  justified  theoretically. 

4.  Estimation  of  the  Parameters  of  Damped  Sinusoids  in  Noise  :  It  is  expected  that,  analogous  to  the 
undamped  case,  the  proposed  method  would  also  produce  superior  estimates  at  lower  cost  for  the  damped 
sinusoidal  estimation  problem. 

5.  Zero  padding  and  Windowing  :  Similar  to  results  in  classical  spectral  estimation,  the  proposed  DFT- 
based  preprocessing  can  be  expected  to  offer  equivalent  performance  improvements  due  to  zero-padding  and 
windowing. 

6.  Use  of  DFT- Based  Signal- Subspace  in  other  Eigen- Based  Methods  :  It  is  expected  that  the  proposed 
DFT-based  signal-subspace  can  be  incorporated  in  some  of  the  other  existing  eigendecomposition  based 
methods,  in  order  to  implement  those  methods  without  eigendecomposition  and  hence,  at  reduced  cost. 
Most  importantly,  analogous  versions  of  MUSIC  and  SVD-based  methods  appear  to  have  certain  feasibility. 

7.  Model  Order  Selection  :  It  may  be  possible  to  obtain  a  rough  estimate  of  the  model  order  or  number  of 
targets  from  the  norms  of  the  DFT-of-AC  vectors. 
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8.  DFT-Prony  :  This  approach  has  also  been  outlined  briefly  in  Subsection  VI.  Based  on  preliminary  simula¬ 
tions,  this  method  appears  to  be  the  simplest  of  all  existing  methods  with  moderately  good  high-resolution 
performance.  This  idea  will  be  further  studied. 

9.  Two-Dimensional  Frequency- Wavenumber  Estimation  :  Incorporation  of  the  DFT-based  signal- 
spaces,  in  place  of  eigendecomposition,  for  2D  frequency- wavenumber  estimation  is  expected  to.  offer  the 
same  advantages  as  in  case  of  the  ID  problem  described  here. 

10.  Hardware  Implementation  :  The  proposed  approach  forms  the  ‘signal-subspace’  with  the  multiplication 
of  a  single  DFT-matrix  which  possesses  some  special  structures.  Hence,  it  may  be  possible  to  take  advantage 
of  currently  available  FFT  hardware  to  pre-process  the  signal  for  extracting  the  signal/noise  subspaces. 
This  is  expected  to  be  another  step  closer  towards  designing  and  fabricating  efficient  hardware  for  real-time 
high-resolution  DOA/Frequency  estimation. 
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Section  -  2.2  :  MAXIMUM-LIKELIHOOD  METHOD  WITH  EXACT  CONSTRAINTS 


Summary 

A  recently  proposed  approximate  Maximum- Likelihood  Estimator  (MLE)  of  multiple  exponentials,  converts 
the  frequency  estimation  problem  into  a  problem  of  estimating  the  coefficients  of  a  2-polynomial  with  roots  at 
the  desired  frequencies  [1,  2],  Theoretically,  the  roots  of  the  estimated  polynomial  should  fall  on  the  unit  circle. 
But  MLE,  as  originally  proposed,  does  not  guarantee  unit  circle  roots.  This  drawback  sometimes  causes  merged 
frequency  estimates,  especially  at  low  SNR  [1,  3].  If  all  the  sufficient  conditions  for  the  z-polynomial  to  have 
unit  circle  roots  are  incorporated,  the  optimization  problem  becomes  too  nonlinear  and  it  loses  the  desirable 
weighted-quadratic  structure  of  MLE.  In  this  work,  the  exact  constraints  are  imposed  on  each  of  the  lst-order 
factors  corresponding  to  individual  frequencies  for  ensuring  unit  circle  roots.  The  constraints  are  applied  during 
optimization  alternately  for  each  frequency.  In  the  absence  of  any  merged  frequency  estimates,  the  RMS  values 
more  closely  approach  the  theoretical  Cramer-Rao  (CR)  bound  at  low  SNR  levels. 

I.  Introduction 

Estimating  the  underlying  parameters  of  multiple  complex  exponential  signals  in  noise  remains  a  vigorously 
researched  topic  in  signal  processing  [1-13].  For  a  single  sinusoid  or  when  the  multiple  frequencies  are  well- 
separated,  the  Periodogram  performs  reasonably  well.  But  if  the  frequencies  are  closely  spaced,  which  often  occurs 
when  the  data  length  is  limited  or  the  aperture  is  too  small,  the  Periodogram  fails  to  distinguish  the  frequencies 
and  produces  merged  frequency  estimates.  In  order  to  overcome  the  Periodogram ’s  resolution  limitation,  many 
high-resolution  methods  have  been  developed  in  the  past  two  decades  [1-13].  In  contrast  to  the  Periodogram, 
these  methods  make  effective  use  of  some  underlying  property  of  the  true  sinusoidal  signal  model. 

Among  all  the  existing  high-resolution  frequency  estimation  methods,  the  MLE  appears  to  provide  the  most 
accurate  frequency  estimates  and  has  the  lowest  SNR  threshold  [1-4],  Other  high-resolution  methods  rely  on 
signal  or  noise  subspace  information  which  is  extracted  from  the  eigendecomposition  of  covariance  matrix  or  SVD 
of  data  matrix  [5,  7-11].  On  the  other  hand,  the  MLE  considers  the  exact  model  of  the  exponential  signal  and 
attempts  to  maximize  the  exact  likelihood  function  to  estimate  the  unknowns.  For  a  single  sinusoid,  the  peak  of 
the  periodogram  itself  corresponds  to  the  ML  estimate,  but  for  multiple  exponentials  the  MLE  turns  out  to  be  a 
nonlinear  optimization  problem  [1-6,  12,  13]. 

The  MLE  approaches  developed  independently  in  [1]  and  [2],  estimate  the  frequencies  from  the  roots  of  a 
z-polynomial.  It  may  be  noted  here  that  in  literature,  these  methods  are  sometimes  referred  to  as  KiSS  [1,  5, 
6]  or  IQML  [2].  In  the  polynomial  domain,  the  ML  optimization  problem  turns  out  to  be  quasi-linear  where  a 
weighted-quadratic  criterion  is  minimized  iteratively.  Though  effective  to  a  large  extent,  MLE  is  known  to  possess 
one  fundamental  drawback  :  the  optimization  procedure  in  [1,  2]  does  not  impose  sufficient  theoretical  constraints 
on  the  polynomial  coefficients  for  the  estimated  roots  to  fall  on  the  unit  circle.  The  primary  goal  of  this  work  is 
to  address  this  unresolved  problem  in  MLE. 

Two  conditions  must  be  satisfied  for  a  general  p-th  order  z-polynomial  to  have  p  unit  circle  roots  :  conjugate 
symmetry  (Cl)  and  a  derivative  constraint  (C2),  the  details  of  which  are  given  later.  In  MLE,  only  Cl  is 
imposed.  The  derivative  constraint  makes  the  problem  highly  nonlinear  and  hence,  C2  can  not  be  incorporated 
in  the  weighted-quadratic  framework  of  MLE.  But  when  p  >  1,  Cl  alone  is  not  sufficient  for  unit  circle  roots. 
Furthermore,  from  the  theory  of  Linear-Phase  FIR  filters,  it  is  well-known  that  the  roots  of  a  symmetric  z- 
polynomial  may  fall  either  on  the  unit  circle  or  they  may  be  in  reciprocal  pairs  falling  inside  and  outside  of 
the  unit  circle.  In  fact,  it  was  demonstrated  in  [1]  and  [3]  that,  if  SNR  <  lOdB  and  the  frequencies  are  spaced 
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closely,  the  roots  extracted  by  MLE  sometimes  appear  in  reciprocal  pairs.  In  such  cases,  two  frequencies  merge 
to  produce  only  a  single  frequency  estimate.  The  alternate  approach  proposed  in  this  work  attempts  to  alleviate 
this  limitation. 

There  is  one  exception  to  the  two  conditions  stated  above  :  for  p  =  1,  the  conjugate  symmetry  constraint 
(Cl)  alone  is  sufficient  for  the  single  root  to  fall  on  the  unit  circle.  This  is  the  main  idea  which  will  be  utilised 
in  developing  the  proposed  Constrained-MLE  (C-MLE)  algorithm.  Specifically,  Cl  will  be  imposed  on  each  of 
the  lst-order  factors  of  the  p-th  order  z— polynomial,  such  that  each  individual  root  falls  on  the  unit  circle.  This 
process  need  not  be  applied  to  all  the  frequencies  at  all  SNRs.  The  constraints  are  imposed  only  on  those  lst-order 
factors  which  produce  merged  frequency  estimates  at  convergence  of  MLE.  The  factors  for  which  the  roots  are 
already  on  the  unit  circle,  are  held  fixed.  The  proposed  algorithm  may  be  considered  to  be  a  polynomial-domain 
counterpart  of  the  ‘Alternating  Projection’  approach  [13]  where  the  ML  criterion  is  minimized  w.r.t.  one  frequency 
at  a  time  while  the  other  frequencies  are  held  at  the  previously  estimated  values.  Our  work  appears  to  be  the 
first  attempt  to  guarantee  unit  circle  roots  on  the  polynomial  coefficients  for  Maximum-Likelihood  frequency 
estimation.  The  constraints  are  primarily  effective  at  low  SNR  levels  when  there  is  a  higher  possibility  for  MLE 
to  produce  merged  frequency  estimates.  In  simulations,  the  RMS  values  of  the  frequency  estimates  using  C-MLE 
were  found  to  be  closer  to  the  theoretical  CR  bounds  than  those  of  the  original  MLE  algorithm. 

The  Section  is  arranged  as  follows  :  In  Subsection-II,  the  ML  problem  is  stated  and  the  original  MLE 
algorithm  is  briefly  discussed  and  the  conditions  needed  for  unit  circle  roots  are  stated.  In  Subsection  III,  the 
proposed  constrained  version  of  MLE  is  introduced.  Simulation  results  are  given  in  Subsection-IV  to  verify  the 
performance  of  C-MLE. 

II.  The  Maximum  Likelihood  Problem  and  a  Brief  Overview  of  MLE 


The  observed  samples  of  a  complex  multiple  exponential  signal  can  be  represented  as 
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where,  wk,  ck  and  4>k  are  the  unknown  angular  frequency,  amplitude  and  phase,  respectively,  of  the  k,h  sinusoid; 
p  is  the  assumed  number  of  sinusoids  and  z(n)  represents  i.i.d.  N( 0,  a2)  Gaussian  noise  samples.  For  this  signal 
model,  the  MLE  corresponds  to  optimization  of  the  following  error  criterion  [1-4]  : 
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ak  A  cte7^*,  for  k  =  1,2, ...,p,  respectively,  are  the  complex  amplitudes.  The  MLE  problem  stated  in  (2)  is 
a  nonlinear  optimization  problem  with  respect  to  the  angular  frequencies.  Instead,  MLE  forms  an  alternative 
but  equivalent  error  criterion  in  the  polynomial  coefficient  domain  which  has  a  quasi-linear  structure  which  is 
well-suited  for  iterative  optimization.  A  brief  summary  of  the  MLE  criterion  is  in  order. 

Let,  B(z)  A  6o  4-  61z~1  +  ...  +  bpz~p,  be  a  pih  degree  z-polynomial  with  p  roots  at  e7"1,  eJUIJ  ... 
e7"',  respectively,  and  b  A  [60  6i  •••  bp]T  be  the  coefficient  vector.  The  MLE  criterion  for  estimating  b  is 

m-M  : 

min  E( b)  =  b^X^BB^j^Xb  where,  (4) 
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The  criterion  in  (4)  appears  to  be  quadratic  in  b,  except  that  the  weight  matrix  itself  depends  on  the  unknown 
coefficients.  Hence,  this  criterion  is  minimized  iteratively.  At  the  (k  -  l)-th  iteration 

min  b"[X"rB(‘-l)Ba(fc-1V1X]b  (6) 

D 

is  optimized,  where  the  weight  matrix  (BBa)  is  formed  using  the  estimate  of  b  found  at  the  previous  iteration. 
At  convergence  of  these  iterations,  the  frequencies  are  found  from  the  roots  of  the  estimated  polynomial  B(z). 
Unfortunately,  direct  optimization  of  the  criterion  in  (4)  does  not  guarantee  that  the  roots  of  B(z)  will  indeed 
fall  on  the  unit  circle  and  it  was  recognized  in  [1,  3]  that  two  conditions,  must  be  satisfied  to  guarantee  unit  circle 
roots  : 


Cl  :  The  coefficients  possess  conjugate  symmetry  : 

bk  =  for,  k  =  0,1,..., p,  and, 

C2  :  For  p  >  1,  the  derivative  of  B(z),  i.e., 

B-M  A  f& 

must  have  zeros  either  inside  or  on  the  unit  circle. 
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The  polynomial  domain  MLE,  as  originally  proposed,  imposes  the  conjugate  symmetry  constraint  only  [1,  2]. 
C2  makes  the  optimization  problem  highly  nonlinear  and  the  weighted-quadratic  structure  of  (4)  is  lost  if  C2  is 
incorporated  in  the  algorithm.  Hence,  no  attempt  was  made  in  [1-4]  to  include  C2  in  the  algorithm.  But  if  p  >  1, 
Cl  is  not  a  sufficient  condition  for  unit  circle  roots.  The  same  condition  may,  in  fact,  lead  to  roots  in  reciprocal 
pairs  which  can  and  does  occur  in  MLE,  especially  at  low  SNR.  In  such  cases,  two  closely  spaced  frequencies  are 
estimated  as  a  single  frequency  only  [1,  3]. 


Important  Observation  :  For  p  =  1,  the  conjugate  symmetry  alone  is  a  sufficient  condition  to  ensure  unit- 
circle  root.  Hence,  we  propose  to  impose  Cl  sequentially  on  each  lst-order  factor  of  B(z)  during  optimization  of 
(4).  In  that  case,  the  optimization  at  each  step  will  be  with  respect  to  only  a  lst-order  factor  of  B(z)  and  hence, 
there  is  no  need  for  satisfying  C2. 


Ill  :  Constrained  MLE  (C-MLE) 


The  p-th  order  polynomial  B(z)  can  be  expressed  in  factored  form  as  : 

B(z)  =  flO’-,')(z)fl(<>(z),  (9) 

where,  B^p~l\z)  A  6oP-^  +  6iP-,*z-'  +  ...  +  6j,p_-,'^z-p+1  and  B^’^(z)  A  6^  -I-  t^z-1,  are  (p  -  1)- 
th  order  and  lst-order  factors,  respectively.  If  conjugate  symmetry  is  imposed  on  the  1st  order  factor,  then, 
fi(*)(z)  =  +  fc;(,)z_1.  Note  that  in  (9)  the  coefficients  of  the  polynomial  B(z)  are  formed  as  the  convolution 

of  the  coefficients  of  fl(P_,)(z)  and  B^(z).  Hence,  in  matrix-vector  notation  : 


/6(r°  o  \ 


b  = 


6(p-) 


0 


b(.P-i) 

°  1  -  4<‘> 


evv 


Bp-jCbj, 


(10) 
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where,  Bp_,  denotes  the  matrix-factor  with  the  i-th  l-st  order  factor  removed  and  A  b$  +  jb^K  Using 
(10)  in  (6),  each  lst-order  factor  of  B(z)  is  estimated  by  optimizing, 


min  bi[C*B"  i(‘_1)Xif(B<‘-1)Bl,(‘'1))-1XB(*_-1)C]bi,  for,  .  =  1,2, . .  ,,p. 
*>• 


This  is  a  weighted-quadratic  criterion  of  the  form  : 

b?W^**bj  where, 


(11) 


(12a) 


wj*7l)  A  C"B"_,(t  1)Xjr(B(t-1)B"(t  1))-1XB^1)C  and  b,  A  .  (126) 

Note  that  the  weight  matrix  W^~ 1  *  is  formed  with  the  estimates  found  at  (k  -  1)  -  th  iteration  step  when  the 
unconstrained  MLE  algorithm  is  assumed  to  have  converged.  The  criterion  in  (11)  can  be  optimized  sequentially 
or  concurrently  for  each  first  order  factor.  At  each  iteration,  b,  is  estimated  as  the  eigenvector  corresponding 
to  the  minimum  eigenvalue  of  1  *  €  BJx5.  The  advantage  of  using  (12a)  instead  of  (6)  is  that,  since  each 
B('\z)  is  a  first-order  z-polynomial,  the  conjugate  symmetry  constraint  is  sufficient  to  guarantee  the  root  of 
£(')(;)  to  fall  on  the  unit  circle.  In  practice,  the  alternate  optimization  procedure  in  (11)  need  not  be  carried  out 
for  all  the  p  factors  of  B(z).  It  needs  to  be  invoked  only  in  those  cases  for  which  unconstrained  MLE  produces 
Merged  frequency  estimates.  The  roots  which  are  already  on  the  unit  circle  need  not  be  optimized  further.  This 
sequential  process  guarantees  that  all  the  roots  of  B(z)  will  indeed  fall  on  the  unit  circle. 


IV.  Simulation  Results 


The  algorithm  described  in  this  Section  has  been  tested  with  the  same  simulated  data  set  used  in  [1]  and  [2]. 
The  following  formula  was  used  to  generate  the  data, 


*(n)  =  aie>w'n  +  +  z(n) 

n  =  0,  1,  . . . ,  24 


(13) 


where,  wi  =  2t/i,wj  =  2r/j,  f\  and  /2  being  0.52  and  0.50,  respectively,  =  1,  a2  =  e7  i ,  z(n)  is  a  computer 
generated  white  zero-mean,  complex  gaussianly  distributed  noise  sequence  with  variance  =  <r2,  i.e.,  ^  is  the 
variance  of  the  real  and  the  imaginary  parts  of  z(n).  SNR  is  defined  as,  10  log10(^~).  Two  hundred  data  sets 
with  independent  noise  epochs  were  used. 

Fig.  la  and  lb  show  the  estimated  roots  for  200  independent  trials  of  MLE  for  SNR  =  5dB  and  lOdB, 
respectively.  Fig.  Id  and  le  show  the  corresponding  results  with  C-MLE.  For  the  lOdB  case,  Figures  lc  and  If 
show  only  the  merged  cases  before  after  applying  the  exact  constraints.  The  unit  circle  roots  in  Fig.  If  does  show 
wider  spread  than  the  corresponding  merged  frequency  estimates  in  Fig.  lc.  Fig.  2  compares  the  performance  of 
MLE  and  C-MLE  with  the  theoretical  CR  bound.  The  results  verify  that  C-MLE  performs  better  than  original 
MLE  at  low  SNR  range.  The  performance  of  C-MLE  has  also  been  compared  with  that  of  the  AP  method  [13]  and 
the  results  are  displayed  in  Fig.  3.  Clearly,  the  proposed  method  outperforms  the  AP  method  for  this  example, 
especially  at  low  SNR. 
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ESTIMATES  USING  MLE  ESTIMATES  USING  C-MLE 


Fig.  1  :  Superimposed  plots  of  estimated  roots  for  200  independent  trials  using  MLE  (a-c)  and 
C-MLE  (d-f). 


Comparison  of  Performance 


Performance  comparison  of  MLE  and  C-MLE  with  the  theoretical  CR-bound.  200 
independent  trials  were  used. 


Performance  comparison  of  C-MLE  and  AP  methods  with  the  theoretical  CR-bound. 
200  independent  trials  were  used. 


Section  •  2.3  :  IMPROVED  AR-PARAMETER  ESTIMATION  FROM  NOISY  OBSERVATION  DATA 
Summary 

Auto-Regressive  (AR)  modeling  is  the  most  widely  used  approach  for  model-based  spectrum  estimation. 
But  almost  all  the  existing  methods  for  AR-parameter  estimation  show  severe  degradation  if  the  observed  signal 
is  corrupted  with  noise.  In  fact,  all  the  commonly  used  techniques,  such  as,  Autocorrelation  Method  (AM), 
Covariance  Method  (CM),  Modified  Covariance  Method  and  their  variations,  give  poor  Power  Spectral  Density 
(PSD)  estimates  when  the  observations  are  noisy.  In  this  Section,  a  data-adaptive  pre-filtenng  approach  is 
presented  to  address  this  problem.  Preliminary  results  indicate  that  when  only  noisy  data  is  available  for  modeling, 
the  proposed  technique  gives  more  accurate  PSD  estimates  than  the  commonly  used  methods. 


INTRODUCTION 

Auto-regressive  (AR)  modeling  continues  to  play  a  very  important  role  in  model-based  spectral  estimation 
[1-4].  A  major  reason  for  the  wide  appeal  of  Auto-regressive  (AR)  modeling  is  its  computational  simplicity. 
Specifically,  the  standard  AR-methods  such  as,  Covariance  method  or  Autocorrelation  method  or  their  variations 
only  need  to  solve  a  set  of  linear  equations.  Furthermore,  in  estimating  ARMA  or  MA  models,  AR-parameter 
estimation  is  a  necessary  intermediate  step  [1].  But  there  remains  a  fundamental  problem  with  most  AR-modeling 
methods  and  that  is  with  regards  to  the  sensitivity  of  the  AR  spectral  estimators  to  observation  noise.  Noisy 
observation  samples  are  indeed  very  common  in  practice,  and  the  performance  of  the  existing  estimators  deteri¬ 
orate  drastically  in  such  cases.  There  have  been  some  previous  attempts  to  address  this  problem.  AR-mode)  in 
noise  being  a  special  type  of  ARMA  model,  this  property  has  been  used  in  [8,  9],  but  this  makes  the  estimation 
problem  highly  nonlinear.  Another  suggested  solution  has  been  to  model  the  process  as  large-order  AR  model 
so  as  to  reduce  the  estimation  bias,  but  this  may  lead  to  spurious  peaks  if  the  chosen  model  order  is  too  high 
[11].  Other  methods  suggest  noise  compensation  to  remove  the  bias  but  this  requires  prior  information  about  the 
observation  noise  [10]. 

The  main  goal  of  this  Section  is  to  utilize  certain  data-prefiltering  ideas  which  have  been  found  to  be  highly 
effective  in  estimating  sinusoidal  frequencies  from  noisy  data  [5,  6]  and  also  for  identifying  deterministic  systems 
from  Input-Output  data  [12]  and  Impulse  Response  Data  [13].  It  is  well-known  that  a  sinusoidal  process  can  be 
viewed  as  a  limiting  case  of  a  narrowband  AR-process.  Indeed,  the  peak  locations  of  AR-spectra  are  commonly 
used  as  the  estimates  of  frequencies  [1].  But  the  poor  performance  of  AR-methods  with  noisy  data  also  causes 
inferior  frequency  estimates  at  low  SNR.  In  order  to  alleviate  this  problem,  a  large  class  of  methods  based  on 
principal-component  (PC)  analysis,  have  been  developed  for  reducing  the  effect  of  noise  in  data  [3,  7].  But  the 
PC-based  methods,  though  highly  effective  for  tone-frequency  estimation,  can  not  be  used  for  cleansing  noisy 
AR-data.  This  is  because  the  data  and  correlation  matrices  are  theoretically  full-rank  in  this  case  even  when 
there  is  no  observation  noise  at  all.  A  new  class  of  algorithms,  referred  to  as  KiSS  or  IQML,  have  been  developed 
recently  for  Maximum-Likelihood  frequency  estimation  [5,  6].  The  KiSS  algorithm  essentially  prefilters  the  noisy 
data  by  iteratively  minimizing  the  projection  of  the  observations  onto  the  noise  subspace  formed  with  linear 
predictor  type  polynomial  coefficients.  It  is  shown  in  this  work  that  this  matrix-prefiltering  approach  also  has  the 
desired  noise-reduction  effect  on  pure-AR-in-noise  data.  Extensive  simulation  studies  indicate  that  the  proposed 
prefiltering  produces  more  accurate  AR-spectra  than  the  conventional  AR-modeling  approaches. 

Formulation  of  Data-Adapttve  Prefiltering 
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The  proposed  approach  may  be  best  explained  by  outlining  the  initialization  step  and  the  noise-subspace 
projection  utilized  by  the  KiSS-IQML  algorithm  [5,  6].  In  that  algorithm,  the  frequency  estimation  problem  is 
essentially  transformed  into  an  AR-type  polynomial  estimation  problem.  Specifically,  let, 


B{z)  A  bo  +  b\2  1  +  ...  +  brz  f  (1) 

be  a pth  degree  z-polynomial  with  roots  at  e*w» ,  e*U3  ...  respectively.  The  coefficient  vector, 

b  A  [to  hi  •  ••  bp]T  (2) 

is  estimated  by  minimization  of  the  following  error  criterion  : 

min  bHX//(BBH)~1Xb  where,  (3a) 


'bp  ...  bo 


B  A 


X  A 


and 
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*(p) 
z{p+  1) 


bp  ...  y 
x(p  -  1) 
*(p) 


x(0) 

x(l) 


^x(N-l)  j  x(N-2)  ...  x(N-p-l)/ 
A  (g  |  G). 


(36) 


(3c) 


(3d) 


The  weighted-quadratic  structure  in  (3a)  is  utilized  for  minimizing  the  criterion  iteratively.  At  the  (*'  +  l)-th 
iteration,  the  weight  matrix  (BBW)  is  formed  with  the  estimate  of  b  found  at  the  i-th  iteration  and  the  following 
criterion  is  minimized  to  obtain  the  updated  estimate  : 


min  bH[XH(B(<)BH(<V1X]b. 


(4) 


The  iterative  algorithm  in  (4)  is  initialized  with,  b  =  [1  0  ...  0]T.  Hence,  the  initial  estimator  has  the  following 
form  : 


min  bHXHXb  (5a) 

b 

=  min  ||Xb||2.  (56) 

b 

Interestingly,  this  criterion  is  exactly  identical  to  the  ‘Covariance  Method’  of  linear  prediction  used  in  AR-modeling 
[4].  If  the  data  contains  no  observation  noise,  the  minimization  in  (5)  would  indeed  produce  exact  frequency 
estimates.  Furthermore,  the  performance  of  covariance  method  for  modeling  pure  AR-processes  without  any 
observation  noise  is  also  known  to  be  quite  good  [4j.  But  the  performance  deteriorates  drastically  with  noisy 
observation  data.  In  fact,  simulations  indicate  that  even  at  reasonably  high  SNR  of  30-35  dB,  Covariance  (or 
Autocorrelation)  method  may  not  be  able  to  distinguish  closely  spaced  peaks  or  frequencies  in  the  underlying 
process. 

In  order  to  improve  on  the  initial  estimate  obtained  using  (5),  the  criterion  in  (4)  is  iteratively  minimized  in 
KiSS/IQML.  But,  the  original  criterion  in  (3a)  also  has  the  following  equivalent  forms  : 

b#Xff(BB")-1Xb  =  b"x"(BB")"1BBi/(BB")-1Xb  (6a) 


41 


=  x"B"(BB"r,BBw(BD")"1Bx  (6*) 

=  x^PbuPbux  (6c) 

=  I|Pb»*II3  (M) 

=  ||Bh(BB")-‘Bx||j  (6c) 

=  ||WBx||*  (6/) 

=  l|WXb||2,  (6S) 


where,  P bh  A  BH(BBW)-1B  denotes  the  ‘projection  matrix’  of  Bw, 


W  A  Bh(BBh)~1  (7a) 

is  a  weighting  matrix  and 

x  A  [*(0)  x(l)  ...  x(N-l)]T),  (7b) 

is  the  observation  vector. 

Equation  (6d)  shows  that,  in  order  to  reduce  the  effect  of  noise  in  this  AR-type  parameter  estimates,  the 
projection  of  the  data  (x)  onto  the  column-space  of  the  BH  matrix  needs  to  be  minimized.  The  criterion 
in  equation  (6g)  is  similar  to  the  criterion  in  (5b)  for  Covariance  method,  except  that  in  (6g)  the  projection 
operation  essentially  prefilters  the  data  matrix  X  by  the  weight  matrix  W  which  is  formed  by  the  coefficients 
estimated  at  the  previous  iteration  step.  The  most  obvious  conclusion  from  this  discussion  is  that  the  noise- 
suppression  capability  of  KiSS-IQML  is  essentially  due  to  this  prefiltering  of  the  data-matrix  (X)  which  appears 
in  conventional  Covariance  method  for  AR  modeling. 

As  mentioned  before,  multiple  sinusoids  can  be  modeled  as  a  limiting  case  of  narrowband  AR-process  [1].  The 
analogies  noted  above  appears  to  lead  to  the  possible  hypothesis  that  similar  prefiltering  operation  may  be  equally 
effective  in  reducing  noise  effects  on  the  AR  parameter  estimates  also,  especially  for  narrowband  AR-processes. 
The  algorithm  outlined  next  essentially  minimizes  the  projection  of  the  data  onto  the  column-space  of  BH  in 
order  to  obtain  improved  estimates  of  the  AR-parameter  vector  b. 


Steps  for  the  Proposed  prefiltering  algorithm 

1  :  Obtain  the  initial  estimate  of  the  AR-parameters  in  b  using  any  of  the  conventional  AR-modeling  methods. 

2  :  Form  the  B^')  matrix  defined  in  (3b)  using  the  estimate  of  b  found  in  the  previous  iteration. 

3  :  Minimize  the  criterion  (4b)  to  obtain  am  updated  estimate  of  b,  which  has  the  following  form  : 

b(<+1>  =  (  . 1 .  )  (8) 

\  -(G"(B('>Bw<'>)"1G)-1G"(B(i)BH<”)-1g  J 

where,  denotes  the  matrix  obtained  is  Step-2  whereas,  g  and  G  are  defined  in  (3d). 

4  :  Go  to  Step-2  unless  ||b(,+l)  -  b<*)||2  <  S,  where  6  is  a  small  number. 

An  important  difference  between  KiSS-IQML  algorithm  and  the  proposed  method  is  that  in  case  of  KiSS- 
IQML,  conjugate-symmetry  constraints  need  to  be  imposed  on  the  coefficients  of  the  £(z)-polynomial  in  an 
attempt  to  constrain  the  roots  to  lie  on  the  unit  circle.  This  makes  the  optimization  problem  even  more  nonlinear. 
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But  in  the  present  case  no  such  constraints  are  necessary  and  hence,  the  optimization  in  (4)  is  more  straight¬ 
forward.  Extensive  simulation  studies  have  shown  that  this  algorithm  does  produce  better  AR  spectrum  match 
at  lower  SNR  than  any  of  the  standard  AR-modeling  techniques. 

Simulation  Results 

The  test  data-set  given  in  Chapter-7  in  [3]  was  generated  for  the  simulation.  Fig.  1.  illustrates  the  per¬ 
formance  of  the  Covariance  method  for  50  independent  realizations  of  the  observation  data  at  30dB  SN  R.  The 
solid  line  in  Fig.  2  shows  the  average  of  estimated  spectra  of  the  50  realizations  and  the  dashed  line  shows 
the  true  spectrum.  Figures  3-4  and  5-6  show  the  corresponding  results  with  Modified  Covariance  Method  and 
Autocorrelation  Method,  respectively,  with  identical  data  sets.  The  results  clearly  demonstrate  that  even  at  this 
moderately  high  SNR,  none  of  these  commonly  used  methods  were  able  to  distinguish  the  two  spectral  peaks 
for  most  of  the  noise  realizations.  Fig.  7  shows  the  results  of  the  proposed  prefiltering  algorithm  for  those  50 
identical  realizations  at  the  same  SNR.  The  iterations  converged  in  6-8  iterations  in  all  cases.  Fig.  8.  shows  the 
average  of  the  50  realizations  with  the  true  spectrum.  This  improvement  was  found  to  be  consistent  even  at  lower 
SNR  values.  Similar  improvements  have  also  been  observed  when  the  Auto-correlation  method  and  the  Modified 
Covariance  method  were  used  to  generate  the  initial  AR  parameter  estimates.  The  plots  clearly  demonstrate 
that  the  proposed  method  was  able  to  match  the  AR-spectra  more  closely.  With  simulated  data,  the  average 
prediction  error  power  for  the  proposed  estimator  was  also  found  to  be  much  smaller  than  the  standard  methods. 
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Section  -  2.4  :  IMPROVED  ARMA-PARAMETER  ESTIMATION  FROM  NOISY  OBSERVATION  DATA 


SUMMARY 

Existing  methods  for  ARMA  modeling  assume  that  the  available  process  is  produced  by  an  ARMA  system 
driven  by  a  white  input  process,  i.e.,  the  observed  process  is  considered  to  be  pure  ARMA.  In  practice,  the 
available  data  usually  have  observation  noise  added  to  it  but  the  ARMA  methods  do  not  address  this  problem. 
Simulations  show  that  performance  of  the  existing  ARMA  methods  deteriorate  when  the  observation  process  is 
noisy.  In  this  Section  a  new  ARMA  algorithm  is  given  which  utilizes  a  recently  developed  deterministic  rational 
system  identification  method  (OM-IO)  [8]  that  minimizes  the  modeling  or  output  error  norm.  The  algorithm  first 
estimates  the  input  process  and  then  invokes  OM-IO  using  the  input-output  data.  Simulations  indicate  that  the 
proposed  method  is  quite  effective  even  at  low  SNR  observation  data. 

INTRODUCTION 

With  both  poles  and  zeroes,  ARMA  models  are  best  capable  of  effectively  representing  general  spectra  with 
possibly  sharp  peaks  as  well  as  deep  valleys.  Modeling  of  ARMA  processes  involves  solution  of  a  set  of  highly 
nonlinear  equations.  Existing  methods  divide  the  problem  into  several  ‘equation  error’  minimization  problems 
to  estimate  the  AR  and  MA  parameters  in  several  stages.  The  estimation  problem  is  further  complicated  if  the 
available  data  is  also  corrupted  with  observation  noise.  In  fact,  simulation  studies  indicate  that  the  performance  of 
the  existing  ARMA  modeling  methods  deteriorate  significantly  with  noisy  data.  This  drawback  may  be  attributed 
to  the  sensitivity  of  equation-error  minimization  based  methods  to  the  presence  of  noise.  In  this  Section,  we 
propose  to  address  this  problem  by  incorporating  a  recently  developed  optimal  algorithm  for  identification  of 
deterministic  ARMA  systems  [8]  into  the  stochastic  ARMA  modeling  problem.  Recent  results  indicate  that 
estimators  based  on  minimizing  model  ‘fitting-error’  have  superior  performance  when  compared  to  those  which 
rely  on  equation-error  minimization  [8,  13].  In  view  of  this,  unlike  existing  ARMA  methods,  the  algorithm 
presented  in  this  work  minimizes  output  or  modeling  errors.  The  results  obtained  so  far  indicate  that  the  proposed 
approach  is  much  more  effective  than  existing  methods  for  ARMA  parameter  estimation  when  the  available  data 
is  not  purely  ARMA  but  has  some  observation  noise  added  to  it. 

The  ARMA  Model 

An  ARMA(p,  q)  process  can  be  represented  in  a  linear  difference  equation  form  as, 


p  i 


x(n)  =  -  bkX(n  ~  + °*u(”  ~  *) 

(1) 

i=l  k=0 

where,  the  corresponding  z-domain  transfer  function  has  the  following  form  : 

rf(-A  a0  +  oiz-1  +  --  +a,z"«  A  A(z) 

1  +  b\z~l  +  &2Z-2  -1 - 1 -bpz~f  =  B(z) 

(2) 

Let, 

a  A  [ao  ai  •  •  •  a,f  and 

(3a) 

b  A  [1  6i  •  •  •  bp]T 

(36) 

denote  the  unknown  MA  and  AR  parameters,  respectively.  In  vector  form, 

x  A  (x(0)  x(l)  •••  x(N  -  1)]T  and 

(4a) 

u  A  [ti(0)  u(l)  •••  u(N  —  1)], 

(46) 
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denote  the  output  data  and  the  driving  noise  sequences,  respectively. 

Previous  Methods 

Estimation  of  the  Maximum-Likelihood  (ML)  parameters  of  an  ARMA  process  is  a  highly  nonlinear  problem. 
Akaike’s  MLE  method  requires  nonlinear  optimization  which  are  prone  to  poor  convergence  if  the  initial  estimates 
are  chosen  properly  [4].  To  overcome  the  complexities  of  MLE,  many  computationally  attractive  techniques  have 
been  developed  also.  Among  these,  the  Modified  Yule-Walker  equations  (MYWE)  method  estimates  the  AR 
parameters  from  the  tail-end  of  Yule- Walker  (Y-W)  equations,  i.e.,  from,  r„(fc),  k  =  g  +  1, . . .  ,g  +  p.  The  output 
process  is  then  filtered  by  the  estimated  A(z)  filter,  which  results  in  an  MA  process  from  which  the  MA  parameters 
can  be  determined  by  any  standard  procedure  for  MA  parameter  estimation  [9].  An  extension  of  this  approach, 
known  as  the  Least-Squares  MYWE  (LSMYWE)  [10],  uses  more  of  the  tail-end  of  Y-W  equations  and  yields 
better  results  than  MYWE. 

In  stochastic  ARMA  modeling,  the  driving  white  noise  sequence  is  completely  unknown.  Clearly,  if  it  were 
somehow  possible  to  have  some  estimate  of  the  driving  noise  u(n),  then  any  input-output  system  identification 
technique  could  be  used  to  estimate  the  ARMA  parameters.  Two  well-known  ARMA  methods  are  indeed  based 
on  this  principle,  namely,  Two-Stage  Least-Squares  [5]  and  Three-Stage  Least-Squares  [7,  12].  The  primary  steps 
in  these  methods  are  to  model  the  output  data  first  as  a  large  order  AR  process,  then  a  prediction  error  sequence 
is  obtained  by  passing  the  data  through  the  inverse  filter  which  is  MA.  This  whitened  prediction  error  sequence 
is  used  as  the  estimate  of  the  input  white  noise  sequence  u(n).  With  this  estimated  input  and  the  observed 
output,  the  ARMA  parameters  are  then  found  by  minimizing  equation  errors  in  two  [5]  or  three  stages  [7].  The 
three-stage  approach  has  been  shown  to  have  lower  variance  than  the  two-stage  case.  But,  as  will  be  shown  with 
simulations  below,  even  the  three-stage  algorithm  can  not  perform  well  when  the  observation  data  is  noisy,  which 
is  quite  possible  in  practical  situations. 

It  may  be  mentioned  here  that  in  [11]  a  data-adaptive  prefiltering  method  has  also  been  proposed  for  improved 
modeling  of  AR-parametera  from  noisy  observation  data.  As  noted  in  [11],  there  have  been  some  previous  work  on 
AR-modeling  from  noisy  data,  but  the  author’s  are  not  aware  of  any  such  work  for  modeling  ARMA  parameters 
from  noisy  data,  which  is  the  problem  considered  in  this  work. 

The  Proposed  Idea 

Instead  of  minimizing  the  equation  error  criterion  as  in  [5,  7],  the  proposed  algorithm  minimizes  the  modeling 
erroror  output  error  criteria.  This  is  also  a  nonlinear  problem,  but  a  recently  developed  input-output  identification 
method  optimally  decouples  the  numerator  and  denominator  problems  [8]  into  two  separate  problems  of  smaller 
dimensions.  The  decoupled  estimators  retain  the  global  optimum  of  the  original  criterion.  It  has  been  further 
shown  in  [8]  that  in  the  decoupled  form,  estimation  of  the  numerator  a  is  a  purely  linear  problem  whereas  the 
estimation  of  the  denominator  is  a  nonlinear  problem  of  reduced  dimensionality.  But  the  nonlinear  criterion  for 
the  denominator  possesses  a  convenient  weighted-quadratic  structure  which  can  be  easily  exploited  to  estimate  the 
denominator  iteratively.  Preliminary  simulation  studies  show  that  the  proposed  method  outperforms  the  existing 
ARMA  modeling  approaches  when  the  observed  data  is  corrupted  with  noise.  Brief  explanation  of  the  underlying 
theory  along  with  the  algorithm  steps  are  in  order.  Some  simulation  results  included  at  the  end  demonstrate  the 
superior  performance  of  the  proposed  method. 

Formulation  of  The  Estimation  Procedure 

Let, 

y(n)  =  x(n)  +  v(n),  (5) 
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be  the  observed  noisy  ARMA  process,  where  v(n)  denotes  the  observation  noise  process.  Let, 


y  A  (y(0)  y(l)  ■  ■  ■  y(N  -  1)]T 


(6) 


denote  the  noisy  output  vector.  Using  covariance  method,  this  observation  data  is  first  modeled  as  a  large  order 
(=L)  AR  model  to  obtain  an  AR-polynomial  BL(z)  such  that  L  »  p,  the  true  AR-order  of  the  underlying 
ARMA  process.  The  observation  sequence  y(n)  is  then  filtered  through  an  MA-filter  in  the  form  of  BL(z)  to 
obtain  a  whitened  prediction  error  sequence  u(n).  Then  using  u(n)  as  the  estimate  of  the  input  sequence,  the 
ARMA  modeling  problem  can  be  restated  as  the  following  output-error  minimization  problem  : 


min 

«,b 


N-l 
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wo  -  ^{«(‘)}]I 


(?) 


It  has  been  shown  in  [8]  that  the  above  nonlinear  problem  can  be  decoupled  into  a  purely  linear  problem  to 
estimate  a  and  a  nonlinear  problem  for  b.  Such  decoupling  techniques  have  also  been  found  to  be  very  effective 
in  Maximum  Likelihood  estimation  of  the  parameters  of  multiple  exponential  models  [2,  3].  It  may  be  noted  here 
that  the  estimators  in  [6,  7]  utilize  the  estimated  prediction  error  sequence  u(n).  But  those  estimators  do  not 
minimize  the  true  model-fitting  defined  in  [7],  but  are  based  on  minimizing  equation  error  norms.  The  following 
definitions  are  necessary  to  formulate  the  decoupling  of  the  numerator  and  denominator  optimization  problems. 

Let  Hb(z)  be  an  inverse  filter  corresponding  to  B(z),  i.e., 


B(z)Hb(z)  =  1. 


(8) 


By  writing  this  convolution  in  matrix-form,  it  can  be  shown  that  [8], 
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which  leads  to  the  conclusion  that  Br  is  orthogonal  to  the  matrix  H».  Utilizing  this  orthogonality  relationship, 
the  optimal  criterion  for  estimating  the  denominator  cam  be  shown  to  be  [8]  : 


min  yHUfB(BHU/UfB)-1B"U/y 
t> 


=  min  b"Z'1(B"U/U?B)-‘Zb 

b 

where,  U  is  a  lower-triangular  convolution  matrix  formed  with  the  estimated  input  sequence  tk(n), 
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The  matrix  U/  is  the  inverse  of  U  and  is  also  lower  triangular. 


(116) 
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z  A  U/y  and 
Z  A  B az, 


where,  the  matrix  Z  has  the  following  Toeplitz  structure, 
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where,  g  is  a  column  vector  formed  with  the  leading  column  of  Z.  The  denominator  vector  b  is  estimated 
by  minimizing  the  criterion  in  (10).  The  minimization  is  performed  iteratively  by  forming  the  weight-matrix 
(BHU/Uj*B)  with  the  estimate  of  b  obtained  at  the  previous  iteration  step.  At  convergence  of  the  iterations, 
the  estimated  b  is  used  to  form  the  matrix  H)  using  its  inverse  sequence  as  in  (8).  Then  the  least-squares  solution 
of  the  numerator  a  is  found  as, 

a  A  (UHj)#y  (12) 

where,  #  denotes  matrix  pseudo-inverse.  The  iterative  process  is  initialized  by  estimates  obtained  by  minimizing 
equation  errors  as  in  (6,  7,  10].  Hence,  the  further  iterations  of  the  proposed  method  can  only  improve  upon  the 
equation-error  based  estimates  because  it  minimizes  the  true  modeling  error  criterion  defined  in  (7). 


The  Overall  algorithm  in  Brief 


The  complete  algorithm  for  the  ARMA  parameter  identification  can  be  summarized  as  the  following  four 
primary  steps  : 


1.  Model  the  observed  sequence  y  by  a  large  order  AR  model. 

2.  Determine  the  prediction  error  white  noise  sequence  u,  which  is  treated  as  the  input  sequence  for  the  system. 

3.  Knowing  u  and  y,  start  the  iterative  procedure  to  minimize  the  error  criterion  in  (10).  At  each  iteration, 
b  is  estimated  either  as  the  eigenvector  corresponding  to  the  minimum  eigenvalue  of  Ztf(BtfU/Uf  B)-1Z 
or  by  setting  6q  =  1.  In  the  later  case,  the  estimate  of  b  at  the  (t  +  l)-th  iteration  is  obtained  using  the 
estimates  of  the  previous  iteration  step  as  follows, 


b«+D 
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where,  the  matrix  W  is  formed  with  the  estimates  of  b  at  the  previous  iteration  step  as, 


W  A  UftB^U/UfB)-1.  (136) 

The  iterations  we  continued  till  convergence  is  reached,  i.e.,  no  significant  change  is  found  in  b  between 
successive  iterations. 


4.  Estimate  a  using  (12). 
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Simulation  Results 


The  simulations  were  performed  with  the  test  data  set  (ARMA4)  used  in  Chapter-10  of  [1],  The  true  system 
PSD  has  two  prominent  peaks  as  shown  in  Fig.  1.  The  PSD  estimates  at  an  SNR  of  20dB  in  the  observation 
data  y(n)  are  shown  in  Fig.  2.  The  results  using  MYWE,  LSMYWE,  Maine- Firoozan  and  the  proposed  method 
are  shown  in  Figures  2a  through  2d.  Clearly  the  proposed  method  performs  better  than  the  other  three.  The 
corresponding  results  with  15dB  SNR  are  shown  in  Fig.  3a  -  3d.  The  performance  of  the  proposed  method 
is  maintained  even  at  this  level  of  SNR,  though  the  results  with  the  three  existing  methods  have  deteriorated. 
Further  simulations  at  lower  SNR  levels  indicate  that  the  peaky  spectral  shape  is  maintained  at  least  up  to  12dB. 
The  efficacy  of  the  algorithm  is  obvious. 
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Section  -  2.5  :  TIME-DOMAIN  DETECTION  OF  ELECTRONIC  WARFARE  SIGNALS  IN  NOISE 


SUMMARY 

In  the  passive  mode  of  operations  of  EW  applications,  source  signals  may  not  be  present  within  a  given 
observation  window,  or  the  signals  may  fill  only  a  part  of  the  estimation  window.  In  either  case,  any  frequency 
estimation  algorithm  may  produce  erroneous  or  noise  frequencies.  Considering  the  relatively  high  computational 
burden  of  any  frequency  estimation  method,  it  is  desirable  to  invoke  a  frequency  estimation  method  only  when 
a  detection  scheme  indicates  high  probability  of  presence  of  threat.  In  this  Section,  the  theory  of  detection  of 
sinusoids  from  Quantized  and  Noisy  time-domain  observation  samples  have  been  developed.  The  theoretical  work 
on  single/multiple  samples  is  mostly  complete.  Studies  with  Quantized  data  have  also  been  performed  and  the 
results  appear  reasonably  good.  Lab  tests  for  the  Envelope  Detection  and  Square-Law  cases  have  been  conducted 
at  Wright  Labs  with  satisfactory  results. 

I.  Introduction  : 

In  Electronic  Warfare  (EW)  environments,  microwave  receivers  play  a  major  role  in  passive  identification  and 
localization  of  unknown  targets  emitting  high-frequency  electro-magnetic  signals.  EW  signals  cover  a  relatively 
wide  bandwidth,  typically  in  the  range  of  0.2  to  15  GHz,  and  existing  microwave  receivers  utilize  mostly  analog 
signal  processing  tools  and  techniques  [1-3].  In  fact,  there  are  no  EW  receivers  that  process  microwave  radar  signals 
entirely  in  the  digital  domain.  With  the  emergence  of  increasingly  faster  and  inexpensive  digital  computers  and 
high-speed  A/D  converters,  it  is  expected  that  digital  processing  of  microwave  radar  signals  is  expected  to  be 
practically  feasible. 

The  primary  task  of  a  microwave  receiver  is  to  gather  data  for  sorting  of  signals  and  identification  of  the  type 
of  the  radar  emitting  the  received  signal.  Based  on  these  information,  jamming,  weapon  delivery  or  other  options 
are  considered.  In  order  to  perform  these  tasks,  the  receiver  must  analyze  the  received  radar  pulses  and  measure  or 
estimate  the  following  six  parameters  :  Angle-of-Arrival  (AOA),  Radio  Frequency  (RF),  Time  of  Arrival  (TOA), 
Pulse  Amplitude  (PA),  Pulse  Width  (PW)  and  Polarization  (P).  But  in  order  to  reduce  computational  burden,  the 
estimation  of  these  parameters  should  be  undertaken  only  when  it  is  determined  that  there  is  a  high  probability 
of  the  presence  of  a  threat  signal. 

In  this  part  of  the  project,  the  detection  problem  has  been  considered  in  the  time-domain  for  single  and 
multiple  samples.  Detection  thresholds  and  Probability  of  Detection  based  on  Neyman-Pearson  Criterion  have 
been  derived.  Derivations  are  given  for  calculating  the  Thresholds  and  Probability  of  Detection  for  both  the 
‘Square-Law’  and  ‘Envelope’  detectors. 

II.  Time-Domain  Detection  : 

Almost  all  existing  AOA/RF  estimation  algorithms  assume  that  the  signal  is  already  present  in  the  observed 
data.  But  in  the  passive  mode  of  operations  of  EW  applications,  source  signals  may  not  be  present  at  all  within 
the  observation  window,  or  the  signals  may  fill  only  a  part  of  the  estimation  window.  In  either  case,  any  frequency 
estimation  algorithm  would  essentially  produce  erroneous  or  noise  frequencies  because  the  observed  signal  would 
not  satisfy  the  model  assumed  by  the  estimation  algorithm.  Considering  the  relatively  high  computational  burden, 
any  estimation  method  should  be  invoked  only  when  a  detection  scheme  indicates  high  probability  of  presence  of 
threat. 

Since  EW  receivers  do  not  have  any  prior  knowledge  about  the  frequency /amplitude/phase  of  the  received 
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signals,  conventional  matched  fitters  can  not  be  used  in  this  case.  An  obvious  solution  would  be  to  perform  the 
detection  in  the  frequency-domain,  i.e.,  the  presence  of  targets  can  be  determined  by  thresholding  of  FFT-peaks. 
The  frequency-domain  approaches  are  robust  but  have  certain  disadvantages  in  that  a  decision  can  be  made  only 
after  a  block  of  data  has  been  collected.  Furthermore,  a  lot  of  computational  power  may  be  wasted  if  FFT  is  taken 
continuously,  even  when  no  target  is  present.  Instead,  we  plan  to  incorporate  a  time-domain  detection  scheme 
that  can  detect  targets  in  real-time  using  a  single  observation  or  a  small  number  of  samples.  Once  a  preliminary 
decision  is  taken,  FFT  or  more  sophisticated  frequency/AOA  estimation  algorithm  can  be  invoked,  if  desired. 

n.l  :  Signal  and  Noise  Model 

Microwave  radars  signals  can  be  modeled  as, 

*(n)  =  Acoa(uck  +  6)  +  n(fc)  (la) 

=  A  coe(uck)  cos  $  -  A  sin(uck)  sin  6  +  n/(k)cos(uick)  +  nq(k)  sin(wei)  (16) 

where,  n(k)  denotes  narrowband  noise  samples.  To  perform  the  time-domain  detection,  the  received  real  data 
is  first  converted  into  a  complex  analytic  signal.  This  is  achieved  by  passing  the  real  signal  through  a  Hilbert 
lYansformer  to  form  the  in-phase  (I)  and  quadrature  (Q)  components  of  the  complex  analytic  signal.  When  no 
signal  is  present,  the  I  and  Q  components  may  be  represented  as, 

Xr(k)  =  n,(k) 

XQ(k)  =  nq(k) 

On  the  other  hand,  in  the  presence  of  signal,  the  corresponding  components  are  given  as  : 

Xj{k)  =  Acoe#  +  n/(k)  (3a) 

Xq(k)  =  .4  sin#  +  nq(k).  (3b) 

Since  the  amplitude,  frequency  and  the  phase  of  the  received  signal  are  unknown,  the  detection  criterion  has 
to  rely  on  thresholding  of  the  amplitude  (PA)  of  the  analytic  signal.  The  frequency  and  phase  can  be  ignored 
for  detecting  only  the  presence  of  a  target  signal.  The  amplitude  threshold  can  not  be  based  on  minimizing  the 
total  probability  of  error  because  the  exact  amplitude  of  the  signal  is  unknown  at  the  receiver.  Furthermore,  the 

probability  of  False  Alarm  ( Pfa )  must  also  be  kept  very  low  (10-6  or  smaller).  Hence,  the  best  detection  scheme 

would  be  to  calculate  the  threshold  by  setting  the  Pfa  to  a  constant.  The  thresholds  for  Square-Law  detector 
have  been  derived  next  for  single  and  multiple  samples  within  a  pulse. 

II. 2  :  Square  Law  Detector 

The  noise  is  assumed  to  be  narrowband  and  Gaussianly  distributed  with  zero-mean  and  variance  =  a2. 
Hence,  for  the  no-signal  case  of  (2)  the  I/Q  noise  samples  are  distributed  as  : 

*/(*)  =  N(0  ,tr2)  (4a) 

*<?(*)  =  Af(0,<r2).  (46) 

In  the  following  derivation,  the  time-variable  k  will  be  suppressed  until  the  multiple  samples  case  is  considered. 

II.2.a  :  Single  Sample  Case 


(2«) 

(26) 
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Assuming  independent  noise  samples,  when  no  signal  is  present,  the  joint  probability  density  function  (PDF) 
of  the  I/Q  channel  outputs  are  given  by  : 


f(Xi,Xq) 


2*<r2 


+  *q). 


(5) 


Let, 


Xi  =  A  cos  a  (6a) 

Xq  —  Asina.  (66) 

Using  the  Jacobian  of  this  transformation,  the  joint  PDF  for  this  polar  form  can  be  shown  to  be  : 

/(»■,<*)  =  2^2e~^u(r)'  (7) 

From  this  the  marginal  for  the  Envelope  (R)  is  given  by, 

/«(»•)  =  Jo  f(r,a)da  =  (r),  (8) 

which  is  known  as  the  Rayleigh  PDF. 

II.2.a.l  :  The  PDF  and  Characteristic  Function  with  No  Signal 
A  square-law  detector  forms  the  following  quantity, 

Z  A  X)  +  X2  =  A2  (9) 

which  needs  to  be  compared  to  a  threshold  to  decide  the  presence/absence  of  a  radar  target.  Since, 
jfc  =  2A  =  2 y/Z,  the  PDF  of  the  Square-Law  output  when  no  signal  (denoted  as,  a)  is  given  as  : 

'*«*>  “  ^  do) 

which  is  the  Exponential  PDF.  The  Characteristic  Function  (CF)  is  defined  as  the  Fourier  TYansform  of  the 
Density  function  : 


C|(u-)  A  T[fz(z\i)) 

_  i 
1  +  j  2w03 

II.2.a.2  :  The  PDF  and  Characteristic  Function  in  Presence  of  Signal 
When  target  is  present,  t.e.,  in  case  of  (3),  the  I/Q  samples  sure  distributed  as  : 

X/(k)  =  N(Acoe0,o2) 

Xq(k)  =  JV(Aco6  0,<r2). 


(11) 
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(12a) 

(126) 


In  this  case,  the  joint  probability  density  function  (PDF)  is  given  by  : 


f(x,,xQ)  =  2^5e~*iTl(*'  ‘  Aa*,)3  +  (*°  "  (13) 

Once  again,  using  the  Jacobian  of  the  transformation,  the  polar-form  joint  PDF  can  be  shown  to  be  : 

/<r,a|.)  =  +  A 3  ’  Mrc°^“  '  #)1«(r).  (14) 

Integrating  over  a,  the  marginal  PDF  of  the  Envelope  is  given  by, 

f2’ 

/fi(r|s)  =  /  /(r,or|s)do  (15a) 

J  o 

= 2^e~*(A' + f,)  r " ,)da'  (i5A) 

-  y** + '•>'”($)  <**> 

where,  io(-)  denotes  Bessel  Function  of  the  zero-th  kind.  The  PDF  in  (15c)  is  known  as  the  Rician. 

Similar  to  the  no-signal  case  in  (9)-(10),  the  PDF  of  the  Square- Law  output  Z  with  signal-plus-noise  is  given 

h(2\s)  -  (16) 

In  this  case,  the  Characteristic  Function  can  be  found  as  follows  : 

c*z(W)  a  n/*w«)j 

=  <17> 

The  following  Fourier  Transform  pair  can  be  found  in  [CF,  page-79,  pair  655.1] : 


— I — e  *<-'**  >). 

w  +  p 


Using  (18)  and  with  appropriate  substitutions,  Cjj(w)  is  given  by  : 


Cz(w)  =  - - — -e~ 1  +  >a-»a 

1  +  j2w(rJ 


II.2.a.3  :  The  Neyman-Pearson  Criterion  with  a  Single  Sample  : 

For  this  one-dimensional  case,  the  decision  that  the  signal  is  present  is  taken  if  the  likelihood-ratio  [17]  : 

where  k  is  a  constant  that  depends  on  the  probability  of  False- Alarm  Pfa  ■  From  this  relationship  it  may  appear 
that  in  order  to  find  the  decision  threshold,  one  would  need  to  know  or  estimate  the  signal.  But  one  of  the  most 
attractive  consequence  of  Neyman-Pearson  criterion  is  that  for  a  given  predetermined  Pfa ,  the  threshold  can  be 
set  by  integrating  /(z|«)  over  the  region  where  the  signal  is  present  [11,  17]. 
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11.2. a.4  :  Probability  of  False  Alarm 

If  the  threshold  is  denoted  as  7,  the  false-alarm  probability  can  be  calculated  as, 

Pfa  =  [  fz{z\»)dz 

=  2^3  J  dz  from  (10), 

=  e~&  (21) 

11.2. a.5  :  Detection  Threshold 

Taking  natural  logarithm  of  both  sides  of  (21),  the  detection  threshold  is  given  as, 

7  =  -2<r3ln  PXFA.  (22) 


II.2.a.6  :  Probability  of  Detection 

If  the  square-law  output  2  of  is  greater  than  7  from  (22),  then  the  decision  is  taken  that  source  target  is 
present.  Hence  the  probability  of  detection  can  be  calculated  from  : 

pb  =  f  fz(z\s)dz 

_ 

=  1  -  J  fz(z)s)dz 

= '  -  h(*£)iz  (23> 
By  letting,  v3  =  5^7  and  with  appropriate  substitutions, 

Pb  =  1  ”  ve~v3 Io^2}J~v^dv.  (24) 

But  this  integral  possesses  the  form  of  an  Incomplete  Toronto  Function  [13,  pp-348]  which  is  defined  as  follows  : 

rB 

TB(m,n,r)  A  2rn-m+1e-rI  /  tm~n e~'' ln(2rt)dt  (25) 

—  Jo 

Hence  Pb  can  be  written  in  a  more  compact  form  as  : 

-  1  -  (26) 


II.2.b  :  Multiple  Samples  Case 

The  detector  performance  can  be  expected  to  improve  if  the  decisions  can  be  based  on  multiple  observations 
within  a  pulse.  The  question  would  then  be  how  to  combine  the  multiple  samples  in  order  to  come  up  with  an 
inference.  For  the  Envelope  Detection  case,  Tsui  and  Sharpin  have  recently  derived  an  M-out-of-iV  scheme  where 
the  presence  of  target  is  decided  if  at  least  M  samples  out  of  a  total  of  N  exceed  the  detection  threshold  [12]. 
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In  this  work  we  take  a  different  approach  where  decisions  are  taken  based  on  the  sum  of  N  squared  samples. 
This  approach  is  more  akin  to  traditional  CW  detection  schemes  where  integration  over  N  pulses  is  performed 
for  making  a  decision  [13]. 

Let  Y  be  the  random  variable  formed  with  the  sum  of  N  independent  squared  samples,  i.e., 

N 

YA'E'Z(k),  (27) 

4  =  1 

where,  the  PDF  and  CF  of  Z(k)  were  derived  in  II.2.a. 

II.2.b.l  :  The  PDF  and  Characteristic  Function  of  Y  with  No  Signal 

When  no  signal  is  present,  the  PDF  of  Y  which  is  formed  as  the  sum  of  N  independent  samples,  is  given  by 
the  following  convolution  : 


fY(y\a)  A  fz(zi\a)  *  /z(*a|«)  ★  ★  /z(z/v I*)  (28) 

where,  each  of  the  Z(k)'a  has  identical  distribution.  Direct  convolution  of  N  PDFs  appears  complicated,  but  it 
is  well-known  that  convolution  in  PDF-domain  implies  multiplication  in  the  CF-domain.  Consequently,  the  CF 
of  V  is  given  by, 

N 

ck w)  =  n  cs(«)  =  leu*)}" 

4=1 

Using  the  inverse  Fourier  TVansform  pair-431  [Campbell  and  Foster,  pp-44],  the  PDF  of  Y  is, 

/y(y|5)  =  (WUv~lj!u(y)  (30) 


11.2. b. 2  :  The  PDF  and  Characteristic  Function  of  Y  in  Presence  of  Signal 

Using  arguments  similar  to  those  in  the  previous  subsection,  when  signal  is  present,  the  CF  of  Y  is  given  by, 

N 

Q(W)  =  n  C'zW  =  [CZ(*)]N 

4=1 


1 


_ 

-c  i  + 


(1  +  j2u<r3)N 

Once  again,  using  the  inverse  Fourier  Transform  pair-650.0  [Campbell  and  Foster,  pp-77],  the  PDF  of  Y  is, 


/v(y|s)  = 


J _ y_ 

2<rJ  NA 2 


-*<NAa  + 


(y) 


(31) 


(32) 


II.2.b.3  :  The  Neyman-Pearson  Criterion  with  Multiple  Samples  : 

For  this  N-dimensional  case,  the  decision  that  the  signal  is  present  is  taken  if  the  likelihood-ratio  [17]  : 

tr  =  MM  >  i,PrA)  (J3) 
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For  a  given  predetermined  Pfa  >  the  threshold  can  be  set  by  integrating  /(y|S)  over  the  region  where  the  signal 
is  present  [11,  17]. 

II.2.b.4  :  Probability  of  False  Alarm 


For  7  denoting  the  threshold,  the  false-alarm  probability  is, 

pfa  =  f  fv(y  |7)dy 

1  e~^yN~t 

=  2 W-D<dy  frOm(30)’ 

"  2 <r*  1  {N-l)\  dy  (34a) 

-■-'(ww*-1)  {ut) 

where,  /(■)  denotes  Incomplete  Gamma  Function  which  is  defined  as, 

ruiATt  e-vvt 

I(u,t)  A  J  — , dv.  (35) 


11.2. b.5  :  Detection  Threshold 

For  a  given  Pfa,  the  threshold  7  can  be  determined  numerically  with  a  computer  or  using  available 
plots/ tables  [13]. 

11.2. b.6  :  Probability  of  Detection 

If  the  sum-of-squares  y  is  greater  than  the  threshold  7  determined  from  (35),  then  the  decision  is  taken  that 
source  target  is  present.  Hence  the  probability  of  detection  can  be  calculated  from  : 

po  -  f  fy(y\9)dy 

=  1  -  /  fr{y\s)dy 
Jo 


By  letting,  and  with  appropriate  substitutions, 


*  -  -  *  - 

This  integral  also  possesses  the  form  of  an  Incomplete  Toronto  Function  defined  in  (25).  Hence  P$  can  be  written 
in  a  more  compact  form  as  :  _ 


H.3  :  Envelope  Detector 
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The  PDF  of  the  envelope  ( R )  for  a  single  sample  was  found  in  (8).  Hence,  for  a  given  Pfa,  the  detection 
threshold  is, 


7  =  \/-2ff2  In  Pfa 

(39) 

The  calculation  of  threshold  with  N  observation  samples  can  be  shown  to  be  [12], 

1  =  tV"(M)  +  "S  *h"e’ 

(40a) 

T  is  found  approximately  from, 

Pfa  =  0.5(1  -  4~\T)) 

(406) 

and  4>(  )  denotes  the  error  function.  More  details  for  the  Envelope  Detector  case  can  be  found  in  [12]. 

It  may  be  noted  that  unlike  the  square  law  and  envelope  detection  threshold  calculations  for  conventional 
radars  [13],  the  discretized  schemes  presented  here  do  not  use  matched  filtered  output  but  use  the  sampled  data 
directly. 
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CHAPTER  3 


OPTIMAL  SYSTEM  IDENTIFICATION  PROBLEMS 


Introduction 

The  rational  System  Identification  theory  is  closely  related  to  the  receiver  design  problem.  In  particular, 
Angle-of-Arrival  (AOA)  and  frequency  estimation  are  two  of  the  most  important  integral  parts  of  moat  radar 
receivers  but  these  two  problems  can  be  addressed  as  special  cases  of  rational  system  identification  problems. 
Furthermore,  digital  EW  receivers  would  require  many  digital  filters  for  various  purposes  such  as,  anti-aliasing, 
image  suppression,  IF  and  etc..  Synthesis  of  digital  IIR  filters  from  any  arbitrary  frequency  domain  specifications 
is  also  one  of  the  important  problems  addressed  by  the  proposed  system  identification  framework.  Identification 
of  unknown  discrete-time  linear  systems  is  a  fundamental  problem  in  signal  processing.  Among  many  available 
parametric  models,  pole-zero  or  rational  transfer  function  model  is  one  of  the  most  effective  and  practical  rep¬ 
resentations.  Optimal  estimation  of  rational  model  parameters  will  be  the  focus  of  this  part  of  the  report.  The 
system  identification  and  signal  analysis  problems  considered  here  are  fundamental  in  nature  and  the  results  are 
expected  to  have  impact  and  usefulness  in  a  wide  range  of  applications  including  EW  receiver  design. 

Applications  of  System  Identification  abound  in  Communication  systems,  Automatic  Control  systems, 
Aerospace  and  Mechanical  Systems,  Econometrics  and  many  other  fields.  Digital  filter  design  from  frequency 
and/or  time-domain  information  has  extensive  applications  in  speech  or  image  processing,  communication,  radar 
or  sonar  signal  processing,  bio-medical  signal  processing,  Digital  Instrumentation  and  Control  and  in  various 
other  fields.  Depending  on  the  application,  the  design  specifications  of  an  unknown  system  may  be  available 
or  prescribed  in  the  time-domain  (T-D)  as,  (i)  Impulse  Response  (IR)  or  (ii)  Input-Output  (10)  data,  and  in 
the  frequency-domain  (F-D)  as  (iii)  FVequency  Response  (FR)  data.  The  standard  synthesis  or  identification 
problem  is  to  estimate  the  numerator  and  denominator  polynomial  coefficients  that  match  the  prescribed  specs 
in  the  least-squares  (LS)  sense.  It  is  well-known  that  these  LS  problems  are  highly  non-linear.  Some  existing 
approaches  minimize  ‘equation  errors’  instead  of  the  true  fitting  errors  and  others  modify  or  linearize  the  true 
model-fitting  criteria  for  iterative  estimation  of  the  numerator  and  denominators  simultaneously. 

The  main  goal  in  this  part  of  the  work  is  to  exploit  certain  powerful  theoretical  results  in  Numerical  Analysis 
to  theoretically  decouple  the  multidimensional  nonlinear  criteria,  into  two  distinct  problems  :  (1)  a  purely  linear 
problem  for  estimating  the  numerator  and  (2)  a  non-linear  problem  for  estimating  the  denominator.  The  nonlinear 
part  is  then  reparameterized  by  invoking  results  on  projection  operators.  In  this  form,  the  denominator  criterion 
possesses  a  weighted  matrix  structure  which  is  convenient  for  iterative  optimization.  But  more  importantly, 
once  the  optimal  denominator  is  known,  the  optimal  numerator  is  found  with  only  a  single  step  of  linear  LS 
estimation.  Removal  of  the  numerator  estimation  from  the  iterative  process  reduces  computational  complexity 
when  compared  with  existing  simultaneous  estimators  in. 

The  theoretical  results  as  well  as  the  algorithmic  framework  we  propose  here  encompass  a  comprehensive 
class  of  system  identification  problems  in  time  and  frequency  domains.  This  important  underlying  common  theme 
appears  to  have  remained  unrecognized  and  unutilized.  In  fact,  one  of  our  goal  is  to  establish  the  analogies  and 
equivalences  between  the  time-domain  and  frequency-domain  optimization  approaches  which  seem  to  have  evolved 
independently.  Our  hope  is  that  a  thorough  study  and  proper  understanding  of  these  equivalences  might  enable 
us  to  apply  and  exchange  useful  ideas  from  one  domain  to  the  other.  It  may  also  lead  to  combined  optimization 
in  the  frequency  and  time  domains  by  matching  the  desired  characteristics  in  both  domains  simultaneously. 

The  proposed  unified  framework  is  expected  to  provide  intuitive  and  useful  theoretical  insights  into  various 
time-domain  and  frequency-domain  identification  and  synthesis  problems.  For  example,  the  1-D  SISO  algorithms 
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can  be  extended  to  multi-dimensional  (m-D)  and  multi-input/multi-output  (M1M0)  problems  in  a  straight¬ 
forward  manner. 
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Section  -  3.1  :  IDENTIFICATION  OF  1-D  RATIONAL  SYSTEMS  FROM  INPUT-OUTPUT  DATA 


Summary 

A  theoretical  and  algorithmic  framework  is  proposed  for  optimal  identification  of  rational  transfer  function 
parameters  of  discrete-time  linear  systems  from  Input-Output  (10)  data.  The  nonlinear  criterion  is  theoretically 
decoupled  into  a  purely  linear  problem  for  estimating  the  optimal  numerator  and  a  nonlinear  problem  for  the 
optimal  denominator.  The  proposed  decoupled  approach  has  reduced  computational  requirements  when  compared 
to  existing  algorithms  which  estimate  the  parameters  simultaneously. 

I.  INTRODUCTION  :  Identification  of  unknown  Linear  Time-Invariant  Discrete-Time  systems  is  a  critical  prob¬ 
lem  in  signal  processing  and  control  theory  [1-13, 15-22}.  This  work  addresses  the  problem  of  optimal  identification 
of  the  parameters  of  rational  transfer  functions  by  Least-Squares  (LS)  fitting  of  observed  input-output  sequences. 
Optimization  of  the  LS  criterion  for  this  problem  requires  multi-dimensional  nonlinear  optimization  [1,  2,  15-21]. 
Many  existing  algorithms  either  modtfy  or  /income  the  true  nonlinear  error  criterion  to  estimate  the  unknown 
parameters  simultaneously.  This  work  will  demonstrate  that  the  optimal  rational  model  identification  problem 
belongs  to  a  special  class  of  mixed-nonlinear  optimization  framework  where  the  linear  and  nonlinear  variables 
separate  [14],  The  true  nonlinear  criterion  will  be  theoretically  decoupled  into  : 

(i)  a  purely  linear  problem  for  obtaining  the  optimal  numerators  and 

(ii)  a  nonlinear  problem  of  reduced  dimensionality  for  determining  the  optimal  denominators. 

The  decoupled  criteria  retain  the  global  optima  of  the  original  criterion.  Only  the  criterion  for  the  denominator 
is  nonlinear  but  it  possesses  a  weighted-matrix  structure  which  is  utilized  for  minimizing  it  iteratively.  The 
optimal  numerator  is  estimated  in  one  step.  Hence,  unlike  some  existing  algorithms  which  estimate  both  sets  of 
parameters  iteratively  [2],  the  proposed  computational  algorithm  has  reduced  computational  requirements. 

II.  PROBLEM  Formulation  :  Rational  transfer  function  representations  of  a  SISO  plant, 


h(2\  -  oW-M1)*'1  +  "  +a(g)z~*  A(z) 

K  ’  ~  1  +  6(l)x-i  +  •  •  •  +  b(p)z~P  =  B(z) ' 

=  h( 0)  +  +  •  •  •  +  h{N  -  !)*-<*-*>  +  •  -  • , 


(1) 

(2) 


where,  6(0)  =  1.  Fig.  1  depicts  what  is  commonly  known  as  the  output-error  model  of  a  plant,  where,  y0(n)  and 
y(n)  denote  the  true  and  observed  (possibly  noisy)  output  signals,  respectively,  and  v(n)  denotes  the  observation 
or  measurement  noise.  Let, 

x  A  {x(0)  *(1)  •••  x(JV-l)]T  (3a) 

and 

yA[y(0)  y(l)  ---  y(N  -  1)]T  (36) 

denote  the  vectors  containing  the  N  input  and  observed  samples,  respectively.  In  vector  form,  the  unknown  model 
parameters  are  defined  as, 

a  A  [a(0)  a(l)  •••  a(?)]T  (4a) 

and 

b  A  [1  6(1)  6(p)]T.  (46) 

The  problem  under  consideration  in  this  part  of  the  project  can  be  stated  as  follows  : 


Given  the  observed  output  data  y  and  the  input  data  x,  estimate  the  optima/  model  parameters  a  and  b  by 
minimising  the  following  LS  model-fitting  criterion  : 

<*(.)>]’  (5) 

■=0  '  ' 

Regarding  methods  related  to  this  work,  Kalman  [1]  had  defined  an  equation  error  to  solve  this  problem  (KM), 
whereas  Steiglitz  and  McBride  (SMM)  iteratively  minimized  a  modified  error  criterion  (2]  to  estimate  both  sets 
of  parameters  simultaneously. 

III.  PROPOSED  Method  (OM-IO)  :  Let  //*(z)  be  the  inverse  filter  corresponding  to  B(z),  i.e., 

B(z)ft(z)  =  1.  (6) 

This  is  a  convolution  operation  and  hence,  in  matrix  notations, 

B»H»  =  IN,  (7) 


where,  Ijv  denotes  an  N  x  N  identity  matrix;  B*  and  H*  are  convolution  matrices  formed  as, 

A  b(i-j),  (8a) 

and 


H»(i,j)  A  for,  ;  =  1 . N 

Note  that  both  these  matrices  are  lower-triangular.  In  partitioned  form, 


B»  = 


H»  =  [Hj|Hr]. 


(86) 


(9) 


where,  B„  G  IRJVx(,+1),  B  €  H,  €  IRW)<^+1>  and  Hr  G  .  Using  (6)  and  assuming 

that  the  input  is  causal,  i.e.,  x(n)  =  0,  for  n  <  0,  the  optimization  criterion  in  (5)  can  be  restated  as, 


N-l  2 

Tln  £  [»(*)  -  *(0  *  MO  *  «(»)]  . 

m,D  »=o  1 


where,  *  denotes  the  convolution  operation.  In  matrix  notations  the  problem  is  equivalent  to 

min||e(a,b)||3  A  min||y  -  XHia||2,  where, 

a,b  =  a,b 

X(«,j)  A  x(i-j)  for,  i,j  =  1 


(10) 

(11a) 

(116) 


This  is  a  mixed  optimization  problem  where  the  linear  and  nonlinear  variables  appear  separately.  If  H;  (i.e.,  b) 
is  known,  then  the  linear  LS  estimate  of  the  numerator, 


k  A  (XH/)#y,  (12) 

where,  (XHj)*  A  ((XH{)t(XH»))-1(XHj)t  denotes  the  pseudo-inverse  of  (XH;)-  Plugging  a  back  in  (11a), 
the  optimization  criterion  for  b  is  given  by, 

min||e(b)||2  A  min||y  -  PXH,y||2  =  min||[IN  -  PxH,]y||2  (13) 

b  =  b  b 


67 


where,  PxHi  A  XH/((XH/)T(XH/))_1(XH/)T,  denotes  the  projection  matrix  of  (XH/).  In  (13),  the  parameters 
in  b  are  indirectly  related  to  the  error  criterion  in  a  rather  complicated  manner  through  PxHi  Next,  the  inherent 
matrix-structure  of  the  criterion  in  (13)  is  utilized  to  reparameterize  the  error  criterion  by  relating  it  directly  to 
the  coefficients  in  b. 

Let  X[(z)  be  the  inverse  of  the  input  sequence  X(z),  i.e.,  X(z)Xi(z)  =  1.  Similar  to  (6)  and  (7),  in  matrix 
notation, 

X/X  =  Ijv,  (14a) 

where,  X/  €  JRNxN  is  also  a  lower  triangular  matrix  defined  as, 


X/  A  x i(i-j),  for,  i,j  =  1  ,...,N. 


(146) 


For  finite  N ,  this  inverse  exists  as  long  as  the  first  element  of  the  input  sequence  is  non-zero,  i.e.,  x(0)  ^  0.  This 
is  not  a  major  restriction  for  the  causal  systems  under  consideration  in  this  work  because  the  output  will  have 
non-zero  leading  values  only  when  there  is  non-zero  input.  But  it  would  be  desirable  that  X(z)  be  minimum- 
phase,  otherwise  Xj(z)  may  be  unbounded  for  some  values  of  z  which  in  turn  may  result  in  very  high  magnitudes 
of  z/(n)  for  large  N.  Combining  (7)  and  (14)  and  using  the  partitioned  forms  of  (9), 


BtX/XHt  =  IN 


X/X[H,|Hr] 


(15a) 


BjX/XH/ 

|  BjXyXHr 

^r+i) 

1  0(,+l)x(jv-f-1) 

1 

btx,xh, 

I  _ 

|  BTX/XHr 

0(/V— f—  l)x(,+l) 

I 

1  I(/V-,-l)x(/V-,-l). 

The  bottom-left  corner  element  shows  that  the  N  x(N  —  q—  1)  matrix  Xj B  and  the  N  x(q  +  l)  matrix  XH/  are 
orthogonal,  i.e.,  (BTX/)(XH,)  =  0(W_,_1)x(,+1).  By  construction,  rank(XjB)  +  rank(X H,)  =  N.  Hence, 
according  to  a  property  of  projection  matrices, 


Px^B  +  PXH,  =  I*. 

Using  this  result  in  (13),  the  following  reparameirized  optimization  criterion  is  obtained, 

min  ||PxTBy||2  =  min  ||X]’B(BTX/X]’B)-1BTX/y||2 

o  *  l> 

=  min  yTXjB(BTX/XjB)-1BTX/y. 

b 

In  order  to  obtain  an  expression  more  convenient  for  optimization,  define, 


z  A  X/y. 


It  can  be  easily  shown  that, 


Btz  A  Zb, 


where,  the  matrix  Z  is  constructed  with  the  elements  of  z  as, 


'  *(ff+  1) 

*(?) 

•  ■  •  z{ 0)  0 

0 

z{q  +  2) 

z{q  + 1) 

z(l)  z(0) 

0 

Z  A 

^(P) 

z(p-  1) 

z(0) 

.z(iV-l) 

z(N  -  2) 

. 

•••  z(N-p-  1) 

(16) 

(17a) 

(176) 

(18) 

(19a) 

(196) 
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(20) 


Using  (19a)  into  (17),  the  optimization  criterion  can  be  re-written  as, 

min  bTZT(BTX/Xf B)~ 1  Zb. 

b 

Equation  (12)  and  the  reparametrized  criterion  in  (20)  are  the  final  decoupled  forms.  It  should  be  emphasized 
here  that,  thus  far,  the  theoretical  derivations  are  mathematically  exact,  i.e.,  no  linearization,  approximation  or 
modification  have  been  introduced  at  the  outset. 

According  to  the  Theorem  stated  in  the  Appendix  [14],  if  b  is  estimated  by  minimizing  the  criterion  in  (20) 
and  if  that  estimate  is  utilized  for  computing  a  using  (12),  then  the  resulting  estimates  are  the  nnigue  and  global 
minimizers  of  the  original  criterion  in  (5)  or  (11).  Furthermore,  once  the  optimal  b  is  known,  the  estimation  of 
the  optimal  a  in  (12)  is  a  linear  problem.  But  more  importantly,  a  needs  be  computed  only  once. 

Algorithm  :  The  nonlinear  criterion  in  (20)  appears  to  be  a  weighted  quadratic  in  the  unknown  vector  b. 
But  the  weight  matrix  ( BTXiXjB)~ 1  itself  is  dependent  on  the  unknowns  in  B.  The  computational  algorithm 
exploits  this  weighted  quadratic  structure  of  the  criterion.  At  fc-th  iteration,  the  algorithm  minimizes, 

min  bT[ZT(BT(*"l)X/XfB(t-1))-1Z]b,  (21) 

b 

where,  B^*-1)  is  formed  by  using  the  estimate  of  b  obtained  at  the  previous  iteration.  b<0^  A  [1  0  •  •  0]T  can 
be  used  as  the  initial  estimate  of  b  to  start  the  iterative  process.  Otherwise,  the  initial  estimates  could  also  be 
found  by  setting  the  middle  matrix  (BTX/XjB)-1  to  identity,  i.e.,  by  optimizing, 

min  bTZTZb.  (22) 

To  ensure  non-trivial  solutions,  6(0)  is  set  to  unity.  Once  the  iterations  converge,  the  estimated  b  is  used  in  (12) 
to  linearly  estimate  the  numerator  coefficient  vector  a. 

On  the  Relationships  with  Other  Methods  :  The  proposed  theoretical  and  algorithmic  framework  appears 
to  be  the  most  general  one  in  its  own  class  of  1-D  deterministic  rational  System  Identification  (SID)  algorithms. 
In  fact,  a  large  body  of  work  on  SID  can  be  formulated  as  special  cases  of  OM-IO.  For  example,  in  case  of  Impulse 
Response  (IR)  fitting,  i.e.,  when  x(n)  =  <(n)  and  y(n)  A  hrf(n),  the  desired  IR,  an  optimal  method  (OM)  has 
been  developed  recently  [8,9].  The  work  in  this  part  of  the  project  may  be  considered  to  be  a  further  generalization 
OM.  The  Evans-Fischl  Method  (EFM)  [5]  was  an  early  precursor  of  OM.  But  EFM  dealt  only  with  the  IR  fitting 
problem  and  it  is  applicable  only  for  the  strictly-proper  case,  i.e.,  when,  p  =  q  +  1.  Furthermore,  the  recently 
proposed  Maximum- Likelihood  Method  for  exponential  modeling  (known  as,  KiSS  or  IQML)  is  basically  a  complex 
version  of  EFM  with  conjugate-symmetry  constraints  imposed  on  the  B(z)  coefficients  [6,  7].  Hence,  KiSS/IQML 
is  also  an  important  special  case  of  OM-IO.  Furthermore,  when  p  =  g  +  1,  the  initialization  step  of  OM  is  identical 
to  Prony’s  Method  [10]  or  Covariance  Method  of  Linear  Prediction  [11,  13].  For  general  cases,  Shanks  [3]  and 
Burrus-Parks  [4]  also  estimated  the  denominator  using  the  initialization  step  of  OM.  For  numerator,  the  linear 
estimator  in  (12)  was  used  by  Shanks  whereas  Burrus-Parks  used,  a(k)  =  53,._06(i)hi(fc  -  i),  for  k  =  0, 1, . .  .,q. 
Finally,  the  formulation  presented  in  here  appears  to  be  quite  well-suited  for  deconvolution  [22].  Specifically,  if 
the  output  and  the  Channel  IR  (or,  alternately,  the  estimates  of  a  and  b)  are  available,  then  the  criterion  in  (11) 
can  be  appropriately  modified  to  obtain  an  LS  or  MLE  of  the  unknown  input  vector  x. 

Section  IV  :  SIMULATION  RESULTS  :  In  ail  figures,  the  true  and  modeled  impulse  responses  are  shown  in 
solid  and  dotted  lines,  respectively. 

Simulation  I  :  In  this  case,  white  noise  was  passed  through  an  ARMA(7,3)  system  with  an  arbitrary  impulse 
response.  The  output  was  corrupted  with  uncorrelated  white  noise.  The  first  N  =  30  input  and  output  samples 
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were  collected  for  identifying  the  system.  True  model  orders  were  used  for  identifying  the  system.  The  results 
with  30dB  and  15dB  SNR  values  are  shown  in  Fig.  2A  and  2B,  respectively. 

Simulation  2  -  Model  Reduction  :  For  the  same  data  sets  of  Simulation- 1  an  ARMA(5,3)  model  was  used  for 
identifying  the  system.  Note  that  the  denominator  order  is  less  than  the  true  order  in  this  case.  The  results  with 
30dB  and  15dB  SNR  values  are  shown  in  Fig.  3A  and  3B,  respectively. 

The  results  of  Simulation- 1  indicate  that  the  proposed  algorithm  is  able  to  match  the  unknown  model  impulse 
response  very  closely  by  minimizing  the  output  error  norm.  Simulation-2  demonstrates  that  the  algorithm  also 
has  the  capability  of  obtaining  reduced  order  models  with  good  fit. 

Number  of  Iterations  for  Convergence  and  CPU  Times  :  For  30dB  SNR,  the  number  of  iterations  for 
convergence  for  actual  and  reduced  order  cases  were  found  to  be  8  and  5,  respectively.  The  iterations  were 
terminated  in  both  cases  when  ||bj+i  —  b<||2  <  10~3  was  achieved  in  each  case.  The  corresponding  CPU  times 
on  VAX-8550  were  3.0  and  2.59  seconds,  respectively.  Similar  differences  in  performance  were  found  for  other 
SNR  values  also.  In  general,  the  algorithm  showed  rapid  convergence  in  all  simulations  performed.  But  if  the 
unknown  system  is  non-minimum  phase  or  if  the  SNR  in  the  output  data  is  too  low,  the  algorithm  may  converge 
to  a  suboptimum  or  the  estimates  may  oscillate.  In  order  to  guarantee  convergence,  the  proposed  iterative 
transformation  must  be  a  contraction  mapping.  This  may  be  difficult  to  demonstrate  in  general  for  any  arbitrary 
input-output  data  set.  Theoretical  analysis  of  the  convergence  properties  of  the  iterative  algorithm  needs  to  be 
performed. 
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APPENDIX  :  Optimality  Properties  of  the  Separate  Estimators 

THEOREM  -  (Adapted  from  Theorem  2.1  in  [14])  :  If  b  is  a  global  minimizer  of  ||e(b)||3  in  (13)  and  a  is 
estimated  using  that  b  as  in  (12),  i.e., 

a  A  (XH»)#y,  (A.l) 

where,  H/  is  formed  using  b,  then  ||e(a,b)||3  is  a  global  minimiser  of  ||e(a,b)||3  and  ||e(a,b)||3  =  ||e(b)||3. 
Conversely,  if  (a,  b)  is  a  global  minimizer  of  ||e(a,  b)||3,  then  b  is  a  global  minimizer  of  ||e(b)||3  and 
||e(b)||3  =  ||e(a,b)||3.  Finally,  if  there  is  an  unique  a  among  all  possible  minimizing  pairs  of  ||e(a,b)||3,  then  a 
must  satisfy  (A.l). 

PROOF  :  See  [14], 
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Fig.  1.  The  Output  Error  Model  Structure 
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Fig.  2:  Estimated  Impulse  Response  with  output  SNR  values  of  (A)  30dB  and  (B)  15dB.  lYue  model- 
order  :  ARMA(7,3)  and  assumed  model-order  :  ARMA(7,3). 


Fig.  3:  Estimated  Impulse  Response  with  output  S  es  of  (A)  30dB  and  (B)  15dB.  True  model- 

order  :  ARMA(7,3)  and  assumed  model-ordt  „&MA(5,3). 


Section  -  3.2  :  IDENTIFICATION  OP  1-D  RATIONAL  SYSTEMS  IN  THE  FREQUENCY  DOMAIN 


SUMMARY 

A  new  Frequency-Domain  (FD)  approach  is  presented  for  optima!  estimation  of  rational  transfer  functions 
coefficients.  The  proposed  method  seeks  to  match  any  arbitrarily-shaped  FD  specifications  in  the  Least-Squares 
(LS)  sense.  The  desired  specifications  may  be  arbitrarily  spaced  in  frequency.  The  design  is  performed  directly 
in  the  digital  domain  and  no  analog  to  digital  transformation  is  necessary.  The  proposed  method  makes  use  of 
the  inherent  mathematical  structure  in  this  rational  modeling  problem  to  theoretically  decouple  the  numerator 
and  denominator  estimation  problems  into  two  smaller  dimensional  problems.  The  denominator  criterion  is 
nonlinear  but  possesses  a  weighted-quadratic  structure  which  is  convenient  for  iterative  optimization.  The  optimal 
numerator  is  found  linearly  by  solving  a  set  of  simultaneous  equations.  The  decoupled  criteria  retain  the  global 
optimality  properties.  The  performance  of  the  algorithm  is  demonstrated  with  some  simulation  examples. 

I  :  Introduction 

Traditionally,  digital  filters  are  designed  by  performing  Impulse  Invariance  or  Bilinear  transformation  on  available 
analog  designs.  Classical  analog  designs  utilise  polynomial  approximations  to  match  standard  filter  shapes  such 
as,  Low-Pass,  High- Pass  etc.  [9,  10].  An  obvious  drawback  of  classical  analog  design  techniques  is  that  filters 
with  arbitrary  or  non-classical  specifications,  as  in  case  of  a  notch  filter,  can  not  be  obtained.  In  this  part  of 
the  report,  a  direct  method  for  frequency-domain  design  of  digital  HR  filters  is  proposed.  The  method  seeks  to 
match  a  desired  frequency  response  with  any  arbitrary  shape  by  minimizing  the  optimal  LS  fitting  error  criterion. 
The  LS  criterion  for  this  problem  involves  multi-dimensional  nonlinear  search  and  several  linearized  or  modified 
approaches  have  been  developed  [2,  3,  21,  31].  There  have  been  some  ad-hoc  attempts  on  designing  digital  filters 
with  special  shapes  [9,  12].  Frequency  domain  version  of  Prony’s  algorithm  has  also  been  presented  recently 
[14,  15,  19,  25].  But  it  appears  that  the  underlying  mathematical  structure  inherent  in  this  rational  modeling 
problem  have  not  been  fully  exploited.  In  this  work,  the  frequency-domain  least-squares  problem  is  formulated  by 
identifying  the  orthogonal  proi^tion  space  which  is  shown  to  be  formed  entirely  by  the  denominator  parameters. 
The  optimal  denominator  ated  by  minimising  the  exact  projection  space  which  is  independent  of  the 

numerator  coefficients.  The  al  numerator  estimation  problem  turns  out  to  be  a  simple  linear  LS  problem. 

It  is  demonstrated  in  this  work  that  the  optimal  rational  identification  problem  in  the  frequency-domain 
belongs  to  a  special  class  of  mixed-nonlinear  optimization  framework  where  the  linear  and  nonlinear  variables 
separate  [13].  It  is  further  shown  that  the  true  nonlinear  criterion  can  be  decoupled  into  : 

(i)  a  purely  linear  problem  for  obtaining  the  optimal  numerator  coefficients  and 

(ii)  a  nonlinear  problem  of  reduced  dimensionality  for  determining  the  optimal  denominator  coefficients. 

This  important  underlying  theoretical  and  algorithmic  aspects  of  designing  digital  filters  in  frequency-domain, 
appears  to  have  remained  mostly  unutilized.  After  decoupling,  the  denominator  criterion  possesses  a  convenient 
weighted-matrix  structure  which  is  then  utilized  to  develop  an  iterative  minimization  algorithm.  Once  the  de¬ 
nominator  is  estimated,  the  optimal  numerator  is  found  only  once  with  linear  LS.  The  decoupled  criteria  retain 
the  global  optima  of  the  original  criterion.  The  proposed  approach  is  closely  related  to  some  time-domain  results 
developed  recently  by  the  present  author  [8,  16,  17].  The  design  methodology  described  here  will  be  based  on 
matching  desired  Discrete-Time- Fourier-TVansform  (DTFT)  values  which  may  be  arbitrarily  spaced  in  frequency. 
But  the  algorithm  can  be  easily  modified  if  the  desired  specifications  are  available  in  the  form  of  DFT  values. 

The  Section  is  arranged  as  follows  :  In  Subsection  II,  the  rational  transfer  model  is  defined  and  the  frequency- 
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domain  identification  problem  is  stated.  In  Subsection  III,  some  existing  methods  addressing  this  problem  are 
briefly  outlined.  The  details  of  the  proposed  decoupled  solution  is  presented  in  Subsection  IV.  Some  simulation 
examples  are  given  in  Subsection  V  to  demonstrate  the  performance  of  the  proposed  approach. 

II  :  The  Rational  Transfer  Function  Model  and  The  Frequency- domain  Design  Problem 


An  ARMA(p,  q)  digital  filter  can  be  modeled  as  : 


H(z)  =  £>(»)*- 

i=0 


a(0)  +  a(l)r~1  +  •  •  •  +  o(g)r~*  .  AT(r) 
l+6(l)r"1+-+6(p)r-P  =  D(z) 


(0 


Let, 


h  A  [h( 0)  Ml)  HN-  1)]T, 


be  the  vector  with  the  first  N  significant  samples  of  H ( z )  and 


a  A  [do  at  •  •  •  af]T  and 
b  A  (1  61  ■  •  •  6P]T 


(2a) 

(24) 

(2c) 


be  the  numerator  and  denominator  coefficient  vectors,  respectively. 

Let  Hd(z)  represent  the  dtstrti  IIR  filter  which  needs  to  be  modeled  as  H(z)  in  (1).  Using  the  notations  of 
equation  (1),  let  Hd(uk),  N(u> *)  and  D(w*)  be  defined  as  the  frequency  response  values  of  Hd(z),  N(z)  and  D(z), 
respectively,  at  z  =  e*** .  The  frequency-domain  identification  problem  can  be  stated  as  follows  : 

Given,  at  k  =  0, 1,2, ....  N  —  1,  the  desired  frequency  response  values  (possibly  arbitrarily  spaced), 

estimate  the  parameters  in  7V(w*)  and  D(w*)  by  optimizing  the  following  LS  error  criterion  : 


minlM2  A  min  £ 

’  ’  i=0 


N(u,,)  ^ 
Dim)  • 


(3) 


III  :  Some  Existing  Frequency-Domain  Direct  Design  Methods 

The  problem  stated  in  (3)  is  a  nonlinear  optimization  problem  and  standard  nonlinear  optimization  schemes  can 
be  used  [7,  II].  But  these  generic  algorithms  are  known  to  be  sensitive  to  initial  choice  of  estimates  and  they 
do  not  specifically  make  use  of  the  unique  mathematical  structures  inherent  in  this  problem.  Some  linearized 
methods  that  specifically  address  the  design  problem  stated  in  (3),  have  also  been  proposed  [2,  3].  More  recently, 
a  decoupled  algorithm  that  utilizes  divided-differences  and  Newton-Raphson,  has  been  reported  in  [14,  28].  In 
order  to  motivate  the  proposed  algorithmic  framework,  brief  outlines  of  some  of  the  direct  FD  design  methods 
are  given  next. 

III.l  :  Levy’s  Method  (LM) 

The  following  criterion  was  proposed  by  Levy  [2]  as  a  frequency-domain  counterpart  of  Kalman’s  original  work 
in  the  time-domain  [I]  : 

N-l  2 

min||ex,M||2  A  min  |D(w<)/fd(w,)  -  lV(wi)|  .  (4) 

’  ’  1=0 

Note  that  the  original  error  criterion  in  (3)  is  modified  in  Levy’s  case.  Apart  from  the  obvious  advantage  of 
single-step  linear  solution,  this  algorithm  does  not  possess  any  other  optimality  properties.  It  may  also  be  noted 
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that  Kalman/Levy-type  approaches  for  the  ARMA  problem  are  closely  related  to  Levinson’s  work  on  the  all-pole 
problem  [18],  where  only  the  first  term  of  the  error  criterion  was  minimized.  The  AR  parameter  estimation  work 
is  further  related  to  Prony’s  method  [19]  and  Pade  Approximation  [20].  Similar  error  criterions  for  the  ARMA 
problem  have  been  later  rediscovered  [21]  and  analyzed  [22]. 

III. 2  :  Sanathanan-Koerner’s  Prefiltering  Method  (SKM) 

The  earliest  work  that  most  closely  approximates  the  true  LS  fitting-error  criterion,  appears  to  be  due  to 
Sanathanan  and  Koerner  [3].  Their  goal  was  to  improve  upon  Levy’s  work  which  did  not  really  attempt  to 
optimize  the  true  criterion  in  (3).  In  this  case,  an  initial  estimate  of  the  denominator  coefficients,  Efi°\uo)  is  first 
obtained  by  minimizing  Levy’s  criterion  in  (4)  and  then  the  following  modified  fitting  error  criterion  is  optimised 
at  the  Jfc-th  iteration  [3], 


•  IU  II*  A  min  V'  I  _  ffM  |2 

min||esK||  A  mi m  ^  |  £K*-1)(Wj)|  ’ 


(5) 


where,  Lfik~l\ui)  denotes  the  denominator  estimate  at  the  previous  iteration  which  is  used  as  a  prefilter  for 
obtaining  the  estimates  at  the  following  iteration  step.  Note  that,  (5)  closely  approximates  (3)  and  both  are 
identical  if,  D{u>i)  =  Lfik~l\ui).  But  using  (5),  the  unknown  parameters  in  a  and  b  can  be  estimated  simultane¬ 
ously  by  solving  a  set  of  linear  equations.  A  time-domain  counterpart  of  Sanathanan-Koerner’s  method  was  later 
discovered  independently  by  Steiglitz  and  McBride  in  [4],  though  the  later  work  is  definitely  more  well-recognized 
in  Signal  Processing  and  System  Identification  literature  [9,  10,  23,  24]. 

m.3  :  Kum  are  sen’s  Decoupled  Method  -  Generalized  (KM-G) 

The  Frequency- Domain  error  criterion  in  (3)  has  been  recently  decoupled  by  Kumaresan  in  [14,  15,  25,  28],  where 
divided-difference  matrices  [26]  have  been  utilized.  Similar  to  a  time-domain  decoupled  algorithm  due  to  Evans 
and  Fischl  (EFM)  [6],  this  approach  was  originally  proposed  for  strictly-proper  cases,  i.e.,  when,  p  =  q  +  1. 
In  the  brief  outline  given  below,  appropriate  modifications  have  been  introduced  in  order  to  generalize  KM  for 
any  arbitrary  numerator  and  denominator  orders.  For  q-th  order  numerator  and  p-th  order  denominator,  the 
decoupled  criterion  for  estimating  the  optimal  denominator  is  : 


minh^"c"(CC")-1Ch3 

b 


where, 


h£  A  [Hd(u 0)  Hd( u/i)  -Mw/v.Or 

denotes  the  vector  containing  the  N  samples  of  the  prescribed  frequency  response  data, 
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Defining  f  A  UDhJf  e  m.(N+p-'-1)xN ,  the  error  criterion  can  be  written  in  the  following  weighted-quadratic 
form  : 

min  bTFH(CCH  )~ 1  Fb,  (8) 

b 

where,  F  €  JRN_,_lxp+1  is  formed  using  the  elements  of  f  as  follows  : 


/(P) 

/(P+D 


/(P-1) 

/(P) 


lf(N+p-q-2 )  f(N+p-q- 3)  •••  f(N-q- 2)J 

The  optimal  denominator  coefficients  are  obtained  using  an  iterative  algorithm.  Once  the  optimal  denominator 
is  available,  the  numerator  is  estimated  as  : 

a  =  (D»Uf+1)#h2,  (10) 

where,  *  denotes  the  pseudo- inverse  and 

0  •  0  1 


D»  A 


b(wi)  ■ ■ 


€K"xJV  and 


D(un-i)  . 


'1  e*u°  e*3*0  •••  e**"0  " 

1  i  ei^>  ...  eiwt 

Uf+i  A  :  6  1RVVx,+1.  (12) 

1  eiu"-i  gj  Jww_,  ...  gjfwn-i 

It  can  be  easily  verified  that  for  the  special  case  of  p  —  q  +  1,  the  general  criteria  given  here  will  be  exactly  same 
as  the  one  given  in  [14,  28].  It  may  be  emphasized  here  that  the  frequency-domain  LS  algorithms  in  [2,  3]  are, 

(i)  Approximations  or  modifications  of  the  original  criterion  in  (3),  and 

(ii)  (p  +  <j)-dimensional  nonlinear  optimization  problems  for  estimating  a  and  b  simultaneously. 

In  contrast,  the  decoupled  method  (KM-G)  estimate  a  and  b  separately.  But  simulation  experiments  indicate  that 
the  desired  minimum  of  the  criterion  in  (8)  may  not  be  achieved  with  only  an  Evans-Fischl  type  LS  minimization 
of  (8).  Instead,  a  further  step  of  Newton-Raphson  had  to  be  incorporated  in  the  algorithm  in  order  to  achieve  the 
desired  optimum  [14].  Unlike  KM-G,  the  optimally  decoupled  method  developed  in  this  work  reaches  the  desired 
optimum  criterion  more  directly  and  without  using  Newton-Raphson. 

It  may  be  also  noted  that  Signal  Processing  Toolbox  of  the  widely  popular  MATLAB  software  package 
provides  a  direct  frequency-domain  design  macro  called  yulsvalk,  which  basically  implements  a  modified  Yule- 
Walker  method  developed  by  Friedlander  and  Porat  [31].  This  method  does  not  attempt  to  minimize  the  true 
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criterion  in  (3).  Instead,  it  attempts  to  fit  the  deterministic  correlation  values  to  obtain  the  rational  model 
parameters  by  essentially  minimizing  an  equation  error.  The  simulation  section  includes  some  comparison  of  the 
performance  of  the  proposed  method  with  this  approach. 

IV  :  Proposed  Method  (OM-DTFT) 

For  time-domain  rational  model  identification  problems,  a  new  framework  has  been  recently  presented  for  decou¬ 
pling  the  denominator  and  numerator  problems  into  two  separate  but  lower-dimensional  optimization  problems 
[8,  16,  17].  In  this  Section  it  is  shown  that  the  nonlinear  frequency-domain  criterion  of  (3)  can  also  be  decoupled 
in  a  similar  fashion. 

Let  Hk(z)  be  the  inverse  filter  corresponding  to  D(z),  i.e., 

£>(*)//»(r)  =  1.  (13c) 

Clearly,  this  is  a  convolution  operation  in  time-domain  and  it  can  be  expressed  using  matrix  notation  as, 


DH»  =  IN, 


(136) 


where,  I/v  denotes  an  IV  x  N  identity  matrix;  D  €  TRN*N  and  Hi  G  HtNxN  are  defined  below  in  appropriate 
partitioned  forms  which  will  be  useful  in  the  algorithm  : 
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(146) 


where,  Bu  G  HtJVx(’+1),  B  G  H,  G  m"x(,+l)  and  Hr  G  If  the  vector  h,  defined  in 

(2a),  represents  the  finite  length  impulse  response  vector  containing  N  significant  Impulse  Response  values,  the 
frequency  response  at  any  frequency  will  be  given  as, 


A  H(z) |j=e,„, 


n=0 


(15) 
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Stacking  the  model  frequency  response  values  at  all  the  N  specified  frequencies,  u>o,  wi,  . . . ,  u>n-u  the  model 
frequency-domain  vector  can  be  expressed  as  : 


hw  A 


ff(e>“  •) 

.//(*'“"-»)- 
[  1  e->— 

1  e~iu> 


1  1 


e-)(N-l)u>o  -I 


g-HN-l)u>N-i 


(16a) 


(1M) 


A  Wh. 


(16c) 


By  definition, 

H(z)  A  =  H»(z)JV(z),  using  (13a),  (17) 

where,  the  right-hand-side  represents  convolution  of  the  numerator  coefficients  with  the  inverse  sequence,  h»(n) 
(corresponding  to  Hi(z)).  Hence,  it  can  be  shown  that  the  model  impulse  response  vector  h  can  be  expressed  as, 


h  =  H(a. 

Using  this  in  (16), 

K  A  WH|« 

With  these  definitions,  the  frequency-domain  filter  design  problem  in  (3)  can  be  restated  as, 

mm||e||2  A  min||h*  -  WHja||J. 


(18) 

(19) 

(20) 


Equation  (20)  is  an  exact  representation  of  the  original  criterion  in  (3),  albeit  in  the  vector-matrix  form.  This 
form  of  the  criterion  explicitly  demonstrates  the  linear  relationship  between  the  fitting  error  e  and  a  and  also  the 
nonlinear  relationship  between  e  and  b  through  the  matrix  H|.  From  this  equation,  it  is  also  apparent  that  this 
is  a  mixed  optimization  problem  where  the  linear  and  nonlinear  variables  appear  separately.  In  order  to  decouple 
the  numerator  and  denominator  estimation  problems,  consider  the  following.  If  Hi  (i.e.,  b)  is  known,  then  the 
minimization  of  (20)  will  produce  the  linear  LS  estimate  of  a  as  follows, 


a  A  (WH,)*h£,  (21) 

where,  (WH|)#  A  ((WH|)T(WHj))-1(WH|)T.  In  practice  though,  b  needs  to  be  estimated  also.  Plugging  a 
back  in  (20),  the  optimization  criterion  for  b  is  found  as, 

min||h' -WH,a||2  =  min  flhj  -  WH,(WH,)*faf  ||2 

t(b  b 

=  min||h^ -PwH,h^||2 

=  min||(I|V-PwHl)hi||2,  (22) 

D 

where,  PwHi  A  WH|((WH|)T(WH|))_1(WH|)T,  denotes  the  projection  matrix  of  (WH|).  Note  that  the 
numerator  and  denominator  estimation  problems  are  now  in  decoupled  forms  in  equations  (21)  and  (22),  respec¬ 
tively.  But  in  (22),  the  parameters  in  b  are  related  to  the  error  criterion  in  a  somewhat  complicated  manner 
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through  PwHi  •  Interestingly,  the  operator  (In  -  Pwhi  )  on  h£  in  (22)  is  the  projection  component  in  h“f  that 
is  orthogonal  to  the  subspace  spanned  by  the  columns  of  WH|.  Next  it  is  shown  that  this  orthogonal  space  can 
be  completely  defined  by  the  denominator  coefficients. 

IV.  1  :  Reparameterization 


Let  W/  denote  the  inverse  of  the  DTFT  matrix  W,  i.e.,  W/W  =  In.  This  inverse  exists  as  long  as  the 
frequencies  u>*’s  are  distinct.  In  combination  with  (13b), 


DW/WH*  =  I* 


Use  of  the  partitioned  forms  of  ( 14)  into  (23)  leads  to, 


[all 

r  BjW/WH, 
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(23) 


(24) 


The  bottom-left  corner  element  shows  that  the  N  x  (N  —  q  -  1)  matrix  WjB  and  the  N 
are  orthogonal,  i.e.,  (BtW/)(WHj)  =  0(N-«-i)x(«+i)-  By  construction, 

x  (q  +  1)  matrix  WH/ 

rank(Wj  B)  +  rank(  WHt)  =  N. 

(25a) 

Hence,  using  a  property  of  projection  matrices, 

P  wJ"b  +  Pwhi  =  In- 

(256) 

Using  this  result  in  (22), 

min  ||e»||2  A  min  HPwfB^II2 

(26a) 

=  min  ||Wj’B(BTW/W7’B)-1BrW/h£||2 

b 

(266) 

=  min  hiTWfB(BTW/Wj’B)-1BTW/hi. 
b 

(26c) 

Note  that  this  rtparameterized  criterion  is  directly  related  to  b,  as  desired.  In  order  to  further  simplify  this 

expression,  define  a  vector  z  of  length  N  as, 

z  A  W/b* 

(27) 

such  that  the  criterion  in  (26)  becomes, 

min  zTB(BTW/Wj’B)-1BTz. 

(28) 

It  can  be  easily  shown  that, 

BTz  A  Zb, 

(29a) 

80 


where,  the  matrix  Z  is  constructed  with  the  elements  of  i  as, 


Note  that  this  alternate  form  has  a  weighted-quadratic  structure  which  is  convenient  for  minimization.  Equations 
(21)  and  (30)  represent  the  final  decoupled  estimators  to  be  utilized  in  the  algorithm  described  below.  It  should 
be  emphasized  here  that,  thus  far,  the  theoretical  derivations  are  mathematically  exact,  i.e.,  no  linearization, 
approximation  or  modification  have  been  introduced  at  the  outset. 

Regarding  optimality  properties  of  the  decoupled  estimators,  theoretical  results  in  [13]  can  be  used  to  prove 
that  if  b  is  estimated  by  minimizing  the  criterion  in  (30)  and  if  that  estimate  is  utilized  for  computing  a  using 
(21),  then  the  resulting  estimates  are  the  ttnigve  and  global  minimizers  of  the  criterion  in  (20).  The  advantage 
of  estimating  the  linear  and  nonlinear  parameters  independently  is  reduction  in  computational  load  because  the 
iterative  part  is  only  with  respect  to  the  p  coefficients  in  b.  Based  on  the  optimal  b,  estimation  of  the  optimal  a 
is  a  simple  linear  least  squares  problem.  But  more  importantly,  a  needs  to  be  computed  only  once. 


IV. 2  :  Algorithm 


The  nonlinear  optimization  criterion  in  (30)  possesses  a  very  useful  matrix  structure.  Specifically,  the  expression 
appears  to  be  a  weighted  quadratic  criterion  in  the  unknown  vector  b.  The  matrices  Z  and  W/  are  known.  But 
the  weight  matrix  (BTW/Wj’B)~I  itself  is  dependent  on  the  unknowns  in  B.  The  computational  algorithm  will 
utilize  this  weighted  quadratic  structure  of  the  criterion  to  formulate  the  iterations.  Specifically,  the  algorithm 
minimizes  the  following  quadratic  error  criterion  at  k-tb  iteration  step  : 

min  bT[ZT(BT(t"1)WiWj'B(*-1))-1Z]b  A  min  bTR,b  (31a) 

where,  B^*-1)  is  formed  by  using  the  estimate  of  b  obtained  at  the  previous  iteration  and 
Ri  A  [ZT(BT(*  1)W/Wj’B(*-1))~1Z]  is  the  weight-matrix.  An  initial  estimate  of  b  is  necessary  to  start 
the  iterative  process,  b^  A  [1  0  ••  0]T  can  be  used  or  the  initial  estimates  could  also  be  found  setting  the 
middle  matrix  (BTW/Wj’B)_I  to  identity,  i.e.,  by  optimizing, 

min  bTZTZb  A  min  brRab  (316) 

l>  —  b 

where,  the  weight-matrix  Rj  A  ZTZ.  In  order  to  ensure  non-trivial  solutions,  the  first  term  of  the  denominator, 
6(0)  is  set  to  unity.  The  computational  algorithm  is  similar  in  nature  to  the  time-domain  counterparts  developed 
recently  [8,  16,  17].  As  outlined  in  the  Appendix,  the  algorithm  has  two  phases.  In  Phase-1,  the  criterion  in  (31a) 
is  minimized  by  neglecting  the  variation  w.r.t.  the  weight  matrix.  Simulation  experience  shows  that  this  Phase 
alone  brings  the  error  quite  close  to  the  minimum.  But  if  necescary,  the  variation  of  the  weight  matrix  may  also 
be  included  by  invoking  Phase-2,  where  the  gradient  of  the  entire  criterion  is  set  to  zero.  Once  the  iterations 
converge,  the  estimated  b  is  used  in  (21)  to  linearly  estimate  the  numerator  coefficient  vector  a. 
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V  :  Simulation  Results 


Two  examples  are  included  to  demonstrate  the  effectiveness  of  the  proposed  algorithm.  The  first  example  considers 
a  Lowpass  filter  design  problem  whereas  the  second  one  designs  a  Notch  filter.  In  all  plots  the  frequency  response 
values  are  displayed  up  to  half  the  sampling  frequency.  For  the  proposed  method,  only  the  Phase- 1  results  are 
given. 

Simulation- 1  :  Lowpass  Filter  Design 

Magnitude  response  values  at  56  frequency  points  around  the  unit  circle  were  taken  for  the  matching  purpose.  In 
Fig.  1  the  estimated  response  with  p  =  6  and  q  =  5  for  the  proposed  method  are  shown  by  the  dashed  curve  and 
the  solid  line  represents  the  desired  response.  The  algorithm  converged  in  6  iterations.  For  the  sake  of  comparing 
with  a  widely  used  direct  method,  the  Modified  Yule-Walker  method  [30,  31]  available  in  the  MATLAB  software 
package  was  used  to  design  a  6th  order  filter.  The  magnitude  response  fit  for  this  case  is  shown  as  the  dot-dash 
line  in  Fig.  1. 

Simulatton-S  :  Notch  Filter 

A  Notch  Filter  design  problem  was  considered  in  this  case.  The  magnitude  response  values  at  101  frequency 
points  around  the  unit  circle  were  taken.  The  estimated  response  with  10th  order  denominator  and  9th  order 
numerator  as  produced  by  the  proposed  method  as  well  as  the  desired  response  are  shown  in  dB  scale  in  Fig.  2 
in  dashed  and  solid  lines,  respectively.  The  algorithm  converged  in  1 1  iterations.  The  dash-dot  line  again  shows 
the  fit  when  the  Modified  Yule- Walker  method  [30,  31]  was  used  to  design  the  10th  order  filter. 

Discussion 

The  first  example  has  been  adopted  from  [14,  28].  The  results  presented  above  for  the  proposed  method  did  not 
have  to  make  use  of  any  generic  nonlinear  optimization  technique,  such  as  Newton-Raphson  to  reach  the  final 
optimum.  Also,  during  the  minimization  process,  all  the  coefficients  were  enforced  to  be  real  and  hence  the  filter  is 
readily  realizable.  It  may  also  be  stated  here  that  the  final  designs  were  stabilized  using  the  macro  called  Polystab 
available  in  MATLAB  [29,  30],  where  the  unstable  roots  are  flipped  inside  the  unit  circle.  The  simulations  clearly 
demonstrate  that  the  proposed  method  can  closely  match  arbitrarily  shaped  frequency  response  data  and  it  also 
appears  to  perform  better  than  a  widely  used  method  for  direct  design. 
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Lowpass  Filter  Design 


Fig.  1  :  The  desired  Lowpass  response  is  shown  as  the  solid  line.  The  estimated  responses  using  the  pro¬ 
posed  method  and  the  Yule- Walker  method  are  shown  in  dashed  and  dot-dash  lines,  respectively. 
The  filter  order  is  six. 
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Notch  Filter  Design 


Fig.  2  :  The  desired  Notch  Filter  is  shown  as  the  solid  line.  The  estimated  responses  using  the  proposed 
method  and  the  Yule- Walker  method  are  shown  in  dashed  and  dot-dash  lines,  respectively.  The 
filter  order  is  ten. 
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APPENDIX  :  Computational  Algorithm 


The  criterion  in  (31a)  is  non-linear  in  b  and  hence  it  can  not  be  minimized  directly.  But  instead  of  using  any 
generic  non-linear  optimization  techniques,  the  inherent  mathematical  structure  of  the  criterion  will  be  utilized  to 
develop  an  iterative  computational  algorithm.  The  algorithm  consists  of  two  phases.  In  Phase-1,  the  variations 
in  the  middle  matrix  (BT  W/WjB)  in  (31a)  is  not  taken  into  account  in  the  derivative  calculations,  whereas  in 
Phase-2  the  gradient  of  the  error  norm  in  (31a)  is  set  to  zero. 

Phase- 1 

The  final  form  of  the  error  vector  in  (31a)  is  rewritten  as, 


e*  =  Wj’B(BTW/Wj'B)-1Zb 

(A.  1) 

A  VZb 

(A.  2) 

Av[giG]b 

{A.  3) 

=  Vg  +  VGb, 

(A.4) 

where, 

V  A  WjB(BTWtWjB)-1, 

(A.5) 

and 

b  A  [6(1)  6(2)  ...  6(p)f. 

(A.6) 

If  the  matrix  V  is 

w.r.t.  b  as  follows 

treated  as  independent  of  b,  an  expression  for  b  can  be  easily  obtained  by  minimizing 

INI2 

b  =  -  (VG)#Vg 

=  -  (GTVTVG)"*GrVTVg.  (A.7) 


But  since  V  does  depend  on  the  elements  in  b,  (A.7)  can  only  be  computed  iteratively.  At  the  (i  +  l)-th  step 
of  iteration,  is  formed  using  the  estimate  of  b  found  in  the  z'-th  iteration  step.  This  leads  to  the  following 
iterative  algorithm  for  computing  b,+1  : 


b<<+l> 


1 

— [F^GJ-^F^jg 


(A.  8) 


where, 

F(<)  A  GtVt(,)VW  (A.9) 

The  iterations  are  continued  until  )|b«+i  -  b,j|2  <  6,  where  6  is  an  arbitrarily  small  number.  It  must  be  noted 
here  that  the  iterations  in  (A.8)  may  not  always  converge  to  the  absolute  minimum  of  the  error  criterion  in  (31a) 
and  hence  the  estimated  b  may  not  be  the  optimum  one.  This  is  because  in  (A.8)  the  variability  of  V  w.r.t.  b 
had  been  ignored  while  minimizing  ||e||2.  To  achieve  the  optimum,  the  gradient  of  the  complete  expression  of 
||e||2  must  be  set  to  zero.  If  desired,  this  can  be  done  in  Phase-2  of  the  algorithm  which  is  outlined  next.  It  may 
be  noted  here  that  the  simulation  studies  indicate  that  the  Phase-1  of  iterations  using  (A.8)  perform  an  excellent 
job  of  bringing  the  estimate  very  close  to  the  optimum.  Once  the  estimates  of  b  converge,  a  is  computed  using 
(21). 
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Phase-2 


In  this  phase,  the  derivative  of  the  matrix  V  w.r.t.  b  is  taken  into  consideration  while  minimising  the  fitting 
error  norm.  By  setting  the  derivative  of  the  squared  norm  in  (31a)  to  zero,  we  obtain  the  updated  b(‘+1}  at  the 
(i  +  l)-th  iteration  as, 

b(i+1)  =  -  [S(i)G]-‘[S(<)]g  (.4.10) 


where  (suppressing  the  superscript  (,)), 


S  A  L1  V  +  GtVtV, 


L  A 


av 


db(l) 


Zb 


dv 

db(P) 


Zb 


]• 


(4.11a) 

(4.116) 


^  A  -tJ-ttt  [Wf  B(BTW/Wj B)-1]  =  Wj'^T(BTW/Wj'B)-1  -  WfB 


db(k)  S  db{k) 


db(k) 

+  btW/W  J 


Sill-’ 


0BT 

db(k) 


W/WJb 


T  o\-l 


W/W  /  B) 


(4.11c) 


and  has  the  same  form  as  the  B  matrix  defined  in  (10)  but  it  is  filled  with  all  zeros  except  at  the  locations 
where  6(Jb)  appears.  For  example, 


3B 

db(P)  - 


0 

...  1 

0 

...  r 

0 

...  o 

1 

...  o 

0 

• 

...  0 

0 

l 

0 

...  o 

0 

0 

•  •  * 

0 

.0 

...  o 

0 

...  o. 

€  J 


Once  b*,+1)  is  found  using  (A. 10),  b*,+l)  can  be  formed  as, 


(4.12) 


b  <<+l>  = 


1 

£(<+*> 

i 

— [S^G]-  l(S^jg 


(4.13a) 

(4.136) 


This  minimization  phase  continues  until  b,+1  ~  b'  is  reached  and  this  optimum  b  vector  corresponds  to  a 
minimum  of  the  error  surface  of  jjealla- 
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Section  -  3.3  :  OPTIMAL  ESTIMATION  OF  THE  PARAMETERS  OF  ALL-POLE  TRANSFER 

Functions 

Summary 

An  algorithm  is  proposed  for  optimal  estimation  of  the  parameters  of  Auto-Regressive  (AR)  or  all-pole 
transfer  function  models  from  prescribed  impulse  response  data.  The  transfer  function  coefficients  are  estimated 
by  minimizing  the  /2-norm  of  the  exact  model  fitting  error.  Existing  methods  either  minimize  equation  errors  or 
modify  the  true  non-linear  fitting  error  criterion.  In  the  proposed  method,  the  multidimensional  nonlinear  error 
criterion  has  been  decoupled  into  a  purely  linear  and  a  nonlinear  subproblem.  Global  optimality  properties  of  the 
decoupled  estimators  have  been  established.  For  data  corrupted  with  Gaussianly  distributed  noise,  the  proposed 
method  produces  Maximum-Likelihood  Estimates  (MLE)  of  the  AR-parameters.  The  inherent  mathematical 
structure  in  the  non-linear  subproblem  is  exploited  in  formulating  an  efficient  iterative  computational  algorithm  for 
its  minimization.  The  proposed  algorithm  provides  an  useful  computational  tool  based  on  appropriate  theoretical 
foundation  for  accurate  modeling  of  all-pole  systems  from  prescribed  impulse  response  data.  The  effectiveness  of 
the  algorithm  has  been  demonstrated  with  several  simulation  examples. 

l.  Introduction 

Parameter  estimation  of  unknown  discrete-time  linear  systems  is  a  fundamental  problem  in  digital  signal 
processing.  Parametric  models  overcome  the  infinite  dimensionality  problem  of  non-parametric  models  with 
parsimonious  representation  of  systems  in  terms  of  only  a  finite  number  of  parameters.  Over  the  last  few  decades 
these  problems  have  been  addressed  in  a  large  body  of  work  in  many  different  fields  [1-17,  22,  24-38,  40-47]. 
Among  many  parametric  models  used  in  signal  processing,  Auto- Regressive  (AR)  or  all-pole  model  is  one  of  the 
most  effective  and  practical  representations. 

The  AR-parameter  identification  problem  arises  both  in  stochastic  and  deterministic  time-series  analysis. 
There  are  probably  two  primary  reasons  for  the  wide  popularity  of  AR  modeling  in  statistical  time  series  analysis. 
Firstly,  according  to  Kolmogorov  Theorem,  any  minimum  phase  transfer  function  H(z)  can  be  represented  by  a 
possibly  infinite  order,  stable  minimum  phase  AR-model  [9,  10].  This  important  theorem  implies  that  even  if  an 
AR  model  is  picked  erroneously,  the  unknown  Power  Spectral  Density  can  still  be  matched  closely  as  long  as  a 
‘large  enough’  AR  model  order  is  chosen.  But  the  second  and  the  main  reason  for  the  popularity  of  AR-modeling 
is  that  it  is  possible  to  obtain  reasonably  good  suboptimal  estimates  of  the  unknown  AR-parameters  by  solving  a 
simultaneous  set  of  linear  equations. 

Modeling  human  vocal-tracts  as  all-pole  systems  and  the  corresponding  Speech  signal  as  AR-process  is  one 
of  the  most  important  applications  of  AR-modeling  [4,  8}.  Furthermore,  two  important  modeling  philosophies, 
viz.,  Linear  Prediction  (LP)  and  Maximum  Entropy  methods  essentially  produce  the  AR-parameters  as  their 
estimates,  regardless  of  the  true  underlying  signal  model. 

This  Section  deals  with  the  problem  of  estimating  the  parameters  of  an  all-pole  transfer  function  to  match 
a  prescribed  or  desired  impulse  response  specification.  The  least-squares  Impulse  Response  (IR)  model  fitting 
error  has  been  chosen  as  the  objective  optimality  criterion.  Many  well-known  techniques  developed  for  statistical 
time-series  analysis  have  been  used  successfully  in  the  deterministic  case  also  [7,  10].  AR-model  fitting  may 
be  considered  a  special  case  of  estimating  the  unknown  parameters  of  general  ARMA  (or  pole-zero)  models. 
ARMA  parameter  estimation  is  known  to  be  a  multidimensional  nonlinear  optimization  problem  and  there  have 
been  extensive  work  on  this  subject  [1-7,  10,  12,  13,  15,  16,  24-31,  34-37,  41-43,  47].  In  one  of  the  earlier 
works,  Kalman  [1]  had  proposed  a  linearized  and  approximate  ‘equation  error’  minimization  technique  which 
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produces  suboptim&l  estimates.  Several  two-step  procedures  where  the  denominator  and  numerator  polynomials 
are  estimated  separately,  have  also  been  proposed  [2,  30].  In  these  methods,  the  denominator  is  first  estimated 
by  minimizing  an  equation  error’  and  then  the  numerator  is  found  by  minimizing  a  linearized  ‘fitting  error’  [2] 
or  by  setting  the  leading  error  samples  to  zeros  [30].  A  thorough  coverage  on  filter  design  by  modeling  may  be 
found  in  [7]. 

Equation  error  minimization  is  a  commonly  used  optimization  procedure  for  estimating  AR-parameters. 
In  fact,  the  well-known  linear  prediction  (LP)  coefficients  [7-10]  are  estimated  by  minimizing  equation  errors. 
Linear  predictors  have  extensive  usefulness  in  speech  analysis,  synthesis  and  coding.  Many  practical  and  efficient 
algorithms  are  available  for  LP  parameter  estimation.  Among  these,  the  ‘Autocorrelation  Method’  (AM)  and 
the  ‘Covariance  Method’  (CM)  are  most  popular.  CM  and  AM  do  not  produce  optimal  estimates  in  the  sense 
that  the  model  fitting  error  norm  is  not  minimized  in  either  case.  In  contrast,  Steiglitz  and  McBride  (SMM)  had 
proposed  a  modified  fitting  error  minimization  criterion  for  estimating  the  coefficients  of  general  ARMA  models 
[3,  4,  7].  SMM  has  also  been  adapted  for  AR  parameter  estimation  [7].  In  absence  of  any  exact  model  fitting  error 
criterion,  SMM  has  established  itself  as  the  standard  method  for  AR  and  ARMA  parameter  estimation  problems 
[3,  4,  7,  10,  12,  22,  25,  34,  40,  47].  In  [5],  a  decoupled  exact  fitting  error  minimization  approach  has  also  been 
proposed  by  Evans  and  Fischl  (EFM).  But  their  algorithm  is  applicable  only  in  the  case  of  strictly  proper  ARMA 
models  where  the  number  of  poles  must  be  exactly  one  more  than  the  number  of  zeros.  Consequently,  the  optimal 
EFM  can  be  applicable  for  identifying  first-order  AR-models  only.  The  proposed  optimal  algorithm  has  no  such 
restrictions. 

The  proposed  algorithm  originates  from  a  recently  developed  optimal  method  (OM)  for  general  ARMA 
modeling  [6].  Unlike  EFM,  the  decoupled  fitting  error  minimization  approach  in  [6]  is  applicable  for  ARMA 
models  with  arbitrary  numbers  of  poles  and  zeros.  Furthermore,  in  contrast  to  the  methods  in  [1-4,  24,  30],  no 
linearization  or  modification  of  error  criterion  is  introduced  in  the  theoretical  derivation  of  the  least-squares  model 
fitting  criterion.  In  this  Section,  the  complete  derivation  of  the  optimal  solution  for  the  AR  case  (OM-AR)  is  being 
presented  for  the  first  time.  It  is  also  shown  that  if  the  observation  data  is  composed  of  true  impulse  response 
corrupted  by  Gaussianly  distributed  noise,  then  the  proposed  optimization  produces  the  Maximum-Likelihood 
estimates  (MLE)  of  the  AR  parameters.  For  other  types  of  noise  or  deviations  least-squares  estimates  (LSE)  are 
found. 

A  critical  step  in  the  theoretical  derivation  of  the  error  criterion  is  to  decouple  the  multidimensional  criterion 
into  a  non-linear  problem  for  the  AR-parameters  and  a  linear  problem  for  the  numerator  coefficient.  In  the 
decoupled  form,  the  fitting  error  is  found  to  be  related  to  an  equation  error  which  is  different  than  the  ones  that 
appear  in  CM  or  AM.  But  the  form  of  the  equation  error  is  shown  to  be  mathematically  appropriate  for  the  AR 
case.  The  non-linear  criterion  possesses  inherent  matrix  prefiltering  structure  which  directly  leads  to  formulating 
an  efficient  iterative  computational  algorithm  for  its  minimization.  Several  simulation  examples  demonstrate  the 
superior  performance  of  the  proposed  approach  when  compared  to  some  of  the  existing  suboptimal  methods. 

The  Section  is  arranged  as  follows  :  in  Subsection  II,  the  problem  is  defined,  the  connection  with  MLE  is 
established  and  some  existing  results  are  briefly  outlined.  In  Subsection  III  the  error  criterion  is  theoretically 
derived  for  the  AR  case  and  the  computational  algorithm  is  presented.  In  Subsection  IV,  several  simulation 
examples  are  given.  Finally,  in  Subsection  V,  some  concluding  remarks  are  given. 

n.  Problem  Statement  and  Previous  Results 

The  2-domain  transfer  function  for  an  auto-regressive  model  can  be  represented  as, 


where  the  coefficient  of  the  z°  term  in  the  denominator  has  been  assumed  to  be  unity  without  any  loss  of  generality. 
As  an  example,  H(z)  may  represent  the  transfer  function  of  human  vocal  tract  which  is  commonly  modeled  as 
an  all-pole  model.  The  model  order  p  is  assumed  to  be  known.  In  case  of  speech  signals,  for  example,  a  lot 
of  experience  and  knowledge  is  already  available  and  the  value  of  p  =  10  or  8  is  usually  chosen.  An  equivalent 
representation  of  the  transfer  function  H(z)  can  also  be  written  in  terms  of  its  impulse  response  as, 

H(z)  =  h( 0)  +  h(l)z~l  + - b  h(N  —  2 +  h(N  -  1  )*-<*-»>  +  - .  (2) 


The  first  N  significant  samples  of  H(z)  can  be  stacked  in  a  vector  form  as, 


h  A  [A(0)  h(  1)  •••  h(N-l)f. 


(3) 


Next,  the  vector  containing  the  N  samples  of  the  ‘prescribed’  or  ‘desired’  impulse  response  data  is  denoted  as, 

hpA[M0)  Ml)  •••  hp(N-  1)]T.  (4) 

The  desired  IR  data  vector  may  represent  the  impulse  response  of  vocal  tract.  With  these  definitions,  the  problem 
addressed  in  this  Section  may  be  stated  as  follows  : 

Given  a  desired  impulse  response  hp,  the  goal  is  to  obtain  the  optimal  estimates  of  the  model  parameters 
n0  and  d  by  minimizing  the  following  least-squares  IR  model-fitting  criterion  : 


..-minlMI*  A  ming  [»,(.)  -  {^}<(')]’. 

-  U 

e  A  hp  — h  and 
d  A  [1  di  •  •  ■  dp]T  . 


where, 


(5) 

(5a) 

(56) 

(5c) 


The  notation,  |  |6(i)  denotes  the  response  of  the  system,  when  driven  by  an  input  sequence,  6(i). 

Clearly,  the  criterion  in  (5)  attempts  to  minimize  the  squared  error  between  the  desired  and  the  estimated  IR  and 
hence,  it  can  be  expected  to  produce  more  accurate  model  than  some  well-known  AR  modeling  methods  (outlined 
below)  which  only  minimize  ‘equation  errors’.  The  least-squares  problem  in  (5)  is  known  to  be  nonlinear  in  d  and 
standard  nonlinear  optimization  algorithms  have  been  utilized  before  in  [15,  25-29,  36].  It  should  be  emphasized 
that  if  the  given  IR-vector  hp  is  composed  of  the  true  IR-vector  h  and  additive  Gaussianly  distributed  noise 
or  deviations  then  the  minimization  criterion  in  (5)  is  exactly  equivalent  to  the  maximization  of  the  Likelihood 
criterion  [see  ref.  10,  pp.  242-248].  Hence,  for  such  a  scenario  the  algorithm  proposed  in  this  Section  produces  the 
MLE  of  the  AR-parameters.  For  all  other  types  of  noise  and  deviations  the  Least-Squares  Estimates  are  found. 
It  may  also  be  noted  that  the  MLE  result  in  [10]  is  primarily  based  on  the  works  in  [5,  13,  44]  where  only  the 
strictly  proper  ARMA  case  was  considered.  The  MLE  for  transfer  functions  with  arbitrary  number  of  poles  and 
zeros  has  been  presented  recently  in  [6]. 

In  many  applications,  such  as  in  linear  prediction  of  speech  signals  [4,  8],  only  the  estimation  of  the  AR- 
parameters  is  of  primary  concern.  The  two  most  commonly  used  LP  algorithms,  AM  and  CM,  do  not  solve  the 
ideal  problem  stated  in  (5)  whereas  SMM  attempts  to  solve  the  ideal  problem  by  appropriate  modification  of  the 
criterion  in  (5).  These  three  approaches  are  briefly  summarized  next. 
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Covariance  Method  [CM] 


The  /2-norm  of  the  following  equation  error  is  minimized  [7-10]  : 

hp(p)  hp(p-  1)  •••  hp(  0) 

hp(N-l)  hP(N  -  2)  hP(N-p-  1) 


_  -CM 


or,  Hcwd  A  effM.  (66) 

Note  that  Hew  is  filled  with  available  IR  data  only  and  hence  ef{M  may  be  considered  an  ‘exact’  equation  error. 
Auto-correlation  Method  [AM] 

In  this  case,  the  /2-norm  of  the  following  equation  error  is  minimized  [7-10]  : 


'  MO) 
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0 

0 

Mi) 

MO) 

0 

0 

Mp) 

Mp  - 1) 

MO) 

hP(N  -  1) 

hP{N  -  2) 

.  ♦  . 

...  hp{N-p-  1) 

0 

hp(N  -  1) 

.  .  . 

•••  hp(N-p- 2) 

0 

0 

hP(N  -  1)  . 

-  -AM 

~  > 


or,  H^d  A  e*M.  (76) 

The  zeros  in  the  upper  and  lower  triangles  of  Ham  are  not  part  of  the  prescribed  IR-vector  h/>  and  hence  e*w 
is  not  an  exact  equation  error. 

It  can  be  observed  from  (6)  and  (7)  that  the  equation  error  for  CM  uses  windowed  data  without  making 
any  prior  assumptions  about  the  data  outside  the  observed  window  (6p(0)  . . .  hp(N  —  1)}.  On  the  other  hand, 
AM  uses  unwindowed  data  but  sets  the  data  outside  the  observation  frame  to  zero.  Because  of  this  reason,  AM 
usually  produces  less  accurate  estimates  than  CM.  But  it  should  also  be  noted  that  even  though  is  not  an 
exact  equation  error,  one  of  the  significant  advantages  of  using  AM  is  that  the  computationally  efficient  Levinson- 
Recursion  algorithm  can  be  utilized.  In  case  of  CM,  a  somewhat  less  efficient  algorithm,  Cholesky  decomposition 
can  be  used  [7-10].  Furthermore,  the  AR  coefficients  obtained  by  minimizing  the  norm  of  e^M  produce  a  stable 
transfer  function. 

Steiglitz- McBride  Method  [SMM] 

This  method  was  originally  developed  for  general  ARMA  parameter  identification  but  it  has  also  been  adapted 
for  AR  parameter  identification.  For  the  AR  case,  the  following  modified  fitting  error  criterion  is  optimized  [7], 


aS  -  (zwH’ 


The  estimate  D(z)  obtained  at  any  iteration  step  is  used  as  a  prefilter  for  obtaining  the  updated  estimates  at  the 
succeeding  iteration.  Equation  (8)  closely  approximates  the  criterion  in  (5)  and  both  are  identical  if  D(z)  =  D(z). 
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But  the  advantage  of  using  (8)  is  that  the  unknown  parameters  in  d  and  no  can  be  estimated  by  solving  a  set  of 
simultaneous  linear  equations.  It  may  be  noted  here  that  in  [7],  the  numerator  coefficient  n0  had  been  assumed 
to  be  unity  but,  in  general,  that  may  not  be  the  case.  The  derivation  of  the  proposed  fitting  error  optimization 
scheme  is  in  order. 

III.  Problem  Formulation  and  Algorithm  Development 

In  this  Subsection,  the  multidimensional  optimization  problem  in  (5)  is  decoupled  into  a  linear  estimation 
problem  for  no  and  a  non-linear  optimization  problem  for  d.  Let  Hd(z)  be  the  inverse  filter  corresponding  to 
D(z),  i.e., 

D(z)Hd(z)  =  1.  (9) 

In  time  domain,  this  corresponds  to  a  convolution  operation  where  the  d*’ s  are  finite  and  the  Mn)’s  are  infinite 
in  extent.  The  first  TV  significant  terms  of  this  convolution  operation  may  be  expressed  in  matrix  notation  as, 


DHj  =  Iw 


where,  lyv  denotes  an  TV  x  TV  identity  matrix, 

r  !  0 

U  i 


d  a  :  . 

=  «p  «p-i 


0  d, 
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e  irNxN 
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dp  . 
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MO)  0  •••  0  I 

Mi)  MO)  •••  o 

Ha  A  cm* 

~  MJV-1)  Mtf-2)  •••  MO) 


Using  (9),  the  expression  in  (I)  can  be  rewritten  as, 

H(z)  =  *  "o Hd(z). 

Equating  the  first  TV  coefficients  of  equal  powers  of  z“l  in  both  sides  of  (12)  and  using  vector  notation, 

h  A  nohrf, 

where,  ha  is  also  the  first  column  of  Ha  defined  in  (lib),  i.e., 

h„  A  (MO)  Mi)  •  Mtf-i)f 

With  these  definitions,  the  problem  stated  in  (5)  can  be  rephrased  as, 

min||e||2  A  min||hP  -  n0ha||2, 

n0)d  === 


where,  the  error  vector  is  defined  as, 


e  A  hp  —  noha- 
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It  is  clear  from  (16)  that  the  error  e  is  linearly  related  to  no  whereas  e  is  non-linearly  related  to  d  through  the 
vector  h,*.  In  this  form,  it  is  apparent  that  the  present  problem  belongs  to  a  class  of  mixed  optimisation  problems 
where  the  linear  and  nonlinear  variables  appear  separately.  This  class  of  problems  has  been  studied  extensively 
by  numerical  analysts  [18-21].  In  their  work,  the  main  objective  had  been  to  optimize  the  two  sets  of  variables 
independently.  Their  argument  goes  as  follows.  If  hj  (i.e.,  d)  is  known,  then  no  can  be  optimally  estimated  by 
the  minimization  of  the  criterion  in  (15)  and  the  resulting  least-squares  estimate  will  be  given  by, 

h0  A  h*hP,  (17) 

where  *  denotes  pseudo-inverse  operation  defined  as,  hj  A  (hjh«i)-1hj.  In  practice,  d  will  not  be  known  and 
it  has  to  be  estimated.  Plugging  ho  back  in  (15),  the  optimization  criterion  for  d  can  be  found  as, 

min||hP  -  n0lidi|3  =  min||h/>  -  (M»f)h|>||J  (18a) 

fi0>d  a 

=  min ||(Iyv  -  PhJhp||2,  (186) 

a 

where  denotes  projection  matrix  defined  as,  Ph<  A  hj(hJhj)_1hJ.  For  a  larger  class  of  multidimensional 
nonlinear  optimization  problems,  it  has  been  proved  in  Theorem  2.1  of  [18]  that  if  d  is  estimated  by  minimizing 
the  criterion  in  (18)  and  if  that  estimate  is  utilized  for  computing  ho  using  (17),  then  the  resulting  estimates 
are  the  unique  and  global  mtnimtzers  of  the  criterion  in  (15).  Hence,  the  original  optimization  problem  in  (5) 
is  identical  to  the  decoupled  estimators  in  (17)  and  (18).  This  type  of  decoupled  optimization  of  linear  and 
non-linear  subproblems  had  been  utilized  before  in  [5,  13,  45,  46]  for  strictly  proper  ARMA  case  and  in  [6]  for 
the  general  ARMA  case.  The  derivation  for  the  AR  case,  as  given  here,  appears  to  be  new. 


The  AR-parameters  in  d  are  related  to  the  error  criterion  in  a  complicated  manner  through  Ph<  Hence,  the 
direct  optimization  of  (18)  w.r.l.  d  would  require  taking  resort  to  standard  non-linear  optimization  techniques 
such  as  Newton- Raphson  or  Gauss-Newton  methods.  Instead,  following  the  strategy  used  in  [6]  for  the  general 
ARMA  case,  the  criterion  in  (18)  is  reparameterized  by  relating  it  directly  to  the  coefficients  in  d.  Appropriate 
partitioning  of  the  matrices  D  and  Hj  gives, 
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Using  these  notations,  the  expression  in  (10)  can  be  rewritten  as, 
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The  bottom-left  corner  element  shows  that  the  JV  x  (N  -  1)  matrix  B  and  the  vector  h<j  are  orthogonal,  t.e., 
Brh<f  =  0.  Also  by  construction, 

rank(  B)  +  rank(h4)  =  N.  (20) 

Hence,  using  a  theorem  on  projection  matrices  (39], 

Pb  +  Pa,  =  I  a  (21) 

Using  this  relationship  in  (18b),  the  following  equivalent  forms  of  reparameterized  optimization  criterion  are 
obtained, 


where,  eCf  is  an  equation  error  defined  as, 

et,  A  BTh 

This  equation  error  can  also  be  rewritten  as, 

r  Ml)  MO) 


=  min||PBhp||2, 

a 

(22a) 

=  min||B(BrB)-1BThP||2, 

d 

(226) 

=  min||B(BTB)-1e«f||2, 

a 

(22c) 

=  mjneff(BTB)_1eM, 

(22  d) 

A  BThP. 

(23) 

e„  =  BThp  = 


hp(N-l)  hp(N  -  2)  •••  hp(N-p-l) 


A  HAfid.  (246) 

A  few  observations  may  be  made  here  regarding  the  equation  error  defined  in  (23).  Clearly,  ee,  differs  from  the 
equation  errors  used  in  CM  and  AM  as  defined  in  (6)  and  (7),  respectively.  The  equation  errors  in  those  cases 
were  formed  in  somewhat  ad  hoc  manner  on  the  basis  of  two  types  of  autocorrelation  estimates  [7,  9,  10].  Od 
the  other  hand,  the  particular  form  of  equation  error  in  (24a)  resulted  from  purely  mathematical  consequences 
of  the  AR  case  under  consideration.  In  particular,  if  the  prescribed  response  hp  happens  to  be  an  exact  impulse 
response  of  a  p-th  order  AR  transfer  function,  then  the  equation  error  in  (24a)  will  be  identically  equal  to  zero, 
but  the  same  will  not  be  true  for  e*qM  in  (7).  The  equation  error  for  CM  appearing  in  (6)  will  also  be  zero  but 
ignores  the  information  contained  in  the  upper  (p  -  1)  equations  of  (24a).  From  this  discussion,  it  can  be 
concluded  that  more  accurate  estimates  may  be  obtained  if  the  equation  error  in  (23)  is  used  for  the  AR  case. 
Minimization  of  this  equation  error  will  be  utilized  later  in  the  computational  algorithm  for  obtaining  the  initial 
estimate  of  d. 

Using  (24b)  in  (22d),  the  reparameterized  criterion  can  be  expressed  in  the  following  useful  form, 

mindTH5H(BTB)-,H4fid.  (25) 

<1 

According  to  Theorem  2.1  of  [18],  the  denominator  vector  d  causing  the  minimum  of  the  criterion  in  (25)  is  the 
desired  optimum  d®.  The  minimized  error  can  then  be  found  from, 

e®  =  Pb-M  (26) 
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where,  B*  is  constructed  by  using  the  optimum  d®.  The  optimum  estimate  of  the  impulse  response  is  then, 

h®  =  hp  -  e®.  (27) 

From  [18],  it  can  be  also  be  inferred  that  if  hg  is  formed  using  d®  then  the  optimal  ng  can  be  obtained  using  (17) 
as, 

ng  A  hg#hp.  (28a) 

Next  it  is  shown  that  instead  of  using  (28a),  the  optimal  numerator  coefficient  can  be  found  in  a  more  straight¬ 
forward  manner  as, 

ng  =  h°(  0),  (28 6) 

where  h®(0)  is  the  first  sample  of  h®,  the  optimal  impulse  response  estimate  found  in  (27).  In  order  to  demonstrate 
this  equivalence,  equation  (28b)  is  rewritten  as, 


ng  =  hf  d®,  (29a) 

where, 

Note  that  the  first  term  in  d  is  always  1 
rewritten  as, 


n$  =  djh®,  (30a) 

=  dj(hp  -  e®),  using  (27),  (306) 

=  d £(hp  -  hp  +  hghg#hp),  using  (18a)  and  (26)  (30c) 

=  (d^h®d)hg#hP,  (30d) 

=  hg#hp,  (30e) 


where,  the  last  equality  uses  the  fact  that,  d„h®«j  =  1,  which  appears  in  the  upper-left  partition  of  (19c).  This 
completes  the  proof  of  equivalence  between  the  expressions  in  (28a)  and  (28b).  It  should  be  noted  that  (29)  may 
be  preferable  over  (28a)  for  computing  no  because  the  computation  of  hg  and  the  pseudo-inverse  solution  required 
in  (28a)  can  be  avoided,  whereas  calculation  of  the  optimal  h°  in  (27)  may  be  a  necessary  step.  Equations  (22)  and 
(29)  are  the  two  desired  decoupled  formulae  for  estimation  of  the  coefficients  of  the  denominator  and  numerator 
polynomials  of  the  AR-model.  It  should  be  mentioned  that  unlike  the  decoupled  forms  of  SMM  given  in  [7]  and 
[34],  no  approximations  were  introduced  in  deriving  the  decoupled  estimators  in  (22)  and  (29).  A  computational 
algorithm  for  minimization  of  the  criterion  in  (22)  is  outlined  next. 

Computational  Algorithm 

The  criterion  in  (22)  is  non-linear  in  d  and  hence  it  can  not  be  minimized  directly.  Standard  gradient-based 
non-linear  optimization  techniques  such  as  Newton- Raphson  or  Gauss-Newton  algorithms  could  be  used.  But 
these  algorithms  utilize  only  the  first  few  terms  of  Taylor  series  and  are  known  to  be  highly  sensitive  to  the  choice 
of  the  initial  estimates.  But  it  can  be  observed  from  (25)  that  the  error  criterion  possesses  a  good  deal  of  matrix 
structure.  Specifically,  the  expression  appears  to  be  a  weighted  quadratic  criterion  in  d,  except  that  the  weight 
matrix  (BtB)-1  itself  is  dependent  on  the  unknowns  in  d.  This  inherent  mathematical  structure  of  the  criterion 
will  be  utilized  to  develop  an  iterative  computational  algorithm.  The  algorithm  is  similar  to  the  ones  for  ARMA 
cases  appearing  in  [5,  6,  13].  Here  the  complete  derivation  for  the  AR  case  will  be  given. 


h?  A  [h®(0)  0  Of.  (296) 

Using  the  partitioning  notation  in  (19a),  equation  (29a)  can  also  be 
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In  order  to  initiate  the  iterative  algorithm,  d  is  first  estimated  by  minimizing  the  f2-norm  of  the  equation 
error  e,f  defined  in  (24a).  Partitioning  H^r,  the  equation  error  e<}  can  be  rewritten  as  follows, 


ee<  =  BThp  A 

=  [*  • 


Mi)  I  MO)  •••  o 

:  I  :  : 

hP(N- 1)  |  hP(N-  2)  •••  hP{N-p- 1) 


■  1  ■ 

di 

d2 

.dp. 

d. 


(31a) 


(316) 


Minimizing  ||eef  ||3  w.r.t.  d,  the  following  initial  estimate  is  obtained 

1 

d<°>  = 


-G*g\ 


(32) 


This  estimate  will  be  utilized  for  initiating  the  iterative  computational  algorithm.  The  final  form  of  the  error 
vector  in  (22a)  is  rewritten  as, 


e  =  B(BTB)-1BThP  (33a) 

A  WBTh/>  using  (24),  (336) 

=  WHyifid  using  (31b),  (33c) 

=  W  fg  :  gJ  d  (33d) 

=  Wg  +  WGd,  (33e) 

where, 

W  A  B(BtB)-1,  and,  (33/) 

d  A  [di  d2  •••  dpf.  (33 g) 


If  the  matrix  W  is  treated  as  independent  of  d,  minimization  of  ||ej|2  w.r.t.  d  results  in  the  following  estimate  : 


d  =  -  (WG)#Wg 

=  -  (GtWtWG)_1  GTWTWg.  (34) 


But  W  does  have  dependence  on  the  elements  in  d  and  hence  the  estimate  in  (34)  can  only  be  computed  iteratively. 
The  estimate  of  d  found  in  the  t-th  iteration  step  is  used  in  (33f)  to  form  which  is  then  utilized  at  the 
(t  +  l)-th  step  of  iteration  to  compute  d,+1  as  follows  : 


where, 


d«+D  = 


1 

-[xWGi-Hxwjg 


XC>  A  GtWt(<)W^> 

=  Gr(BT(,)B(,))~1. 


(35a) 
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(356) 

(35c) 


The  iterations  are  continued  until  ||di+i  —  dj|)2  <  6,  where  6  is  an  arbitrarily  small  number. 

It  must  be  noted  here  that  the  iterations  in  (35)  may  not  always  converge  to  the  absolute  minimum  of 
the  error  criterion  in  (5)  and  hence  the  estimated  d  may  not  be  the  optimum  one.  This  is  because  in  (35)  the 
variability  of  W  w.r.t.  d  has  been  ignored  while  minimizing  ||e||2.  To  achieve  the  optimum,  the  gradient  of  the 
complete  expression  of  ||e||2  must  be  set  to  zero.  If  desired,  this  may  be  done  in  a  second  phase  of  the  algorithm 
which  is  outlined  in  the  Appendix.  Invoking  Phase-2  will  assure  that  at  least  a  local  minimum  will  be  achieved. 
But  it  may  be  noted  here  that  the  simulation  studies  indicate  that  the  Phase- 1  of  iterations  using  (35)  does 
an  excellent  job  of  bringing  the  estimate  very  close  to  the  optimum.  It  will  be  shown  in  Subsection  IV  that 
the  Phase-2,  if  invoked,  causes  almost  insignificant  changes  in  the  d  vector  and  the  minimized  error  norm.  In 
simulations,  the  convergence  was  found  to  be  quite  rapid  in  both  the  phases.  Once  the  estimates  of  d  converge, 
n0  is  found  using  (26),  (27)  and  (29),  in  sequence. 

Discussion 

The  major  computational  burden  of  the  algorithm  is  in  performing  the  iterative  refinement  in  (35),  where, 
at  each  iteration  step  an  (N  —  1)  x  (N  —  1)  matrix  (BTB)  needs  to  be  inverted.  But  (BTB)  is  a  banded  and 
symmetric  matrix  which  can  be  inverted  using  computationally  efficient  Cholesky  decomposition  [8,  10].  Further 
reduction  in  computation  is  also  possible  because  though  (BTB)  is  not  purely  Toeplitz,  a  major  (N  —  p)x  (N  —  p) 
diagonal  block  is  symmetric-banded-Toeplitz  and  this  block  can  be  inverted  with  0[(7V  —  p)  log(N  -  p)]  +  0\p*] 
operations  [23].  The  other  (p  —  1)  x  (p  —  1)  diagonal  block  is  symmetric  and  can  be  inverted  with  0[(p  —  l)2] 
operations.  Furthermore,  the  non-diagonal  blocks  contain  mostly  zero  elements.  Hence,  using  the  block  matrix 
inversion  formula  due  to  Schur  [48],  this  matrix  inversion  can  be  computed  with  less  than  0[(N  —  l)2]  operations. 
It  may  also  be  noted  that  in  case  of  SMM  the  calculation  of  IR  of  the  inverse  filter  and  data  filtering  are  required 
at  every  step  of  iteration,  whereas  the  proposed  method  uses  the  estimated  d  directly  to  form  the  B  matrix. 

The  LS  error  criterion  defined  in  (5)  attempts  to  match  only  the  first  N  available  samples  of  hp(n).  No  explicit 
assumption  has  been  made  about  the  unobserved  samples,  but  the  estimated  rational  transfer  function  essentially 
extends  the  impulse  response  beyond  the  observations.  It  may  be  noted  here  that  minimum  phase  property  can 
not  be  guaranteed  with  the  AR-parameter  estimates  produced  by  the  proposed  algorithm.  Extensive  simulation 
studies  indicate  though  that  with  converging  IR  sequences,  the  algorithms  always  produced  stable  solutions.  It 
should  also  be  pointed  out  that  among  existing  methods,  only  the  autocorrelation  method  can  guarantee  stable 
solutions.  But  AM  uses  windowed  data  and  the  IR  fit  with  the  estimates  is  usually  not  very  accurate  because  the 
original  least-squares  IR  error  criterion  is  not  minimized.  To  ensure  minimum  phase  solution,  AM  can  be  used 
(instead  of  (32))  to  obtain  the  initial  estimates  for  starting  the  iterative  AR-algorithm.  If  the  estimates  obtained 
from  the  iterative  scheme  becomes  maximum  phase  at  any  iteration  step  of  the  AR-algorithm,  the  iterations  can 
be  terminated  at  that  stage.  The  estimate  found  at  the  preceding  iteration  should  be  accepted  as  the  best  possible 
minimum  phase  solution  that  minimizes  the  optimal  LS  criterion  in  (5). 

The  model  order  selection  problem  has  not  been  addressed  in  this  work.  It  appears  that  for  this  essentially 
deterministic  problem,  Akaike  Information  Criterion  (AIC)  or  Minimum  Description  Length  Criterion  (MDL) 
may  not  be  applicable.  But  these  criteria  may  be  utilized  when  the  prescribed  impulse  response  data  consists  of 
true  impulse  response  embedded  in  Gaussianly  distributed  noise. 

The  algorithm  presented  in  this  Section  may  also  be  quite  useful  for  estimating  MA  filter  coefficients. 
Presently,  the  most  effective  algorithm  for  MA  modeling  is  Durbin’s  method  [11]  which,  in  fact,  relies  on  two 
steps  of  AR  parameter  estimation.  Traditionally,  AM  is  utilized  in  both  steps  of  Durbin’s  method  because  it 
produces  minimum-phase  polynomials  [7,  9-11].  But  the  estimates  obtained  using  AM  may  not  be  optimal  be¬ 
cause  the  true  impulse  response  fitting  error  norm  is  not  minimized.  But  the  algorithm  presented  here  produces 
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optimal  least-squares  AR  filter  coefficients  from  prescribed  impulse  response  data.  Hence,  it  can  be  expected  that 
the  introduction  of  the  proposed  AR  algorithm  in  one  or  both  stages  of  Durbin’s  algorithm  may  produce  more 
accurate  MA  parameter  estimates. 

It  has  been  shown  in  [7]  that  the  original  SMM  can  also  be  decoupled  into  a  linear  and  a  non-linear  subprob¬ 
lems.  In  a  recent  paper  [34],  the  strictly  proper  case  of  the  original  SMM  has  been  decoupled  somewhat  differently 
than  in  [7].  But  more  interestingly,  it  has  also  been  demonstrated  that  the  non-linear  part  of  the  decoupled  SM 
criterion  has  exact  mathematical  equivalence  with  the  optimal  EFM  criterion  in  [5).  It  appears  that  using  the 
new  definitions  of  the  matrices  resulting  from  the  matrix  partitioning  in  (19),  the  AR- version  of  SMM  as  given 
in  (8)  can  also  be  decoupled  into  linear  and  nonlinear  subproblems.  This  equivalence  may  have  an  important 
consequence  for  the  proposed  algorithm.  There  already  exists  a  convergence  analysis  of  the  original  SM  method 
in  [47].  It  can  be  hoped  that  the  convergence  analysis  will  also  apply  to  the  decoupled  form  of  SM  method  given 
in  [34].  If  that  happens  to  be  the  case,  as  alluded  to  in  [34],  the  convergence  analysis  in  [47]  should  also  apply  to 
the  iterative  computational  algorithm  presented  in  this  work.  It  should  be  noted  though  that  the  results  of  SMM 
and  the  proposed  optimal  method  may  not  be  identical.  Specifically,  the  numerator  in  the  decoupled  form  of  [34] 
is  computed  somewhat  differently  than  (26)  which  is  the  optimal  estimate.  Furthermore,  it  should  be  also  added 
that  the  iterative  scheme  in  (35a)  is  not  the  only  possible  approach  for  iterative  minimization  of  the  equivalent 
criterion  in  (22).  In  fact,  removing  the  requirement  of  do  =  1,  the  eigenvector  corresponding  to  the  minimum 
eigenvalue  of  the  matrix  H^fl(BT^'  B(*))~1Haji  may  also  be  used  as  d^,+1\  the  estimate  at  the  (»+ l)-th  iteration 
step  [50].  This  possibility  is  not  obvious  from  the  original  SMM  algorithm  in  [3]. 

rv.  Simulation  results 

In  this  Subsection,  the  performance  of  the  proposed  algorithm  is  evaluated  by  means  of  several  AR(p)  model 
identification  examples  with  different  p  values.  6  —  10"6  was  used  as  the  stopping  criterion  in  both  phases  of  the 
algorithm  for  all  the  examples  below.  The  fitting-error  norm  defined  in  (5)  was  calculated  at  convergence  using 
the  estimated  parameters  and  the  results  are  tabulated  in  the  ‘Minimized  Error  Norm’  column.  Furthermore,  in 
order  to  get  a  relative  sense  of  performance,  the  logarithm  of  the  ratios  of  the  powers  of  the  ‘true  IR’  (known  in 
these  simulations)  and  the  error  powers  are  also  tabulated  in  the  ‘Closeness  in  dB’  columns. 

Simulation  1  : 

The  desired  impulse  response  has  a  Triangular  form  as  shown  by  the  solid  lines  in  Fig.  1A  -  ID.  The 
resulting  impulse  response  fit  using  Covariance  method  and  Auto-correlation  method  are  shown  as  connected 
circles  in  Figures  IA  and  IB,  respectively.  The  impulse  response  match  at  the  end  of  each  of  the  two  phases 
of  the  algorithm  described  in  Subsection  III  with  p  =  4  are  shown  in  Fig.  1C  and  Fig.  ID,  respectively.  The 
minimized  error  norm  and  the  closeness  of  the  fit  to  the  desired  signal  hp  are  listed  in  Table  1.  The  number  of 
iterations  for  convergence  are  also  listed.  It  can  be  seen  from  the  table  and  the  figures  that  compared  to  AM 
and  CM,  the  proposed  scheme  provided  more  accurate  estimates.  But  it  may  also  be  observed  that  there  is  no 
significant  difference  in  the  results  between  the  1st  and  the  2nd  phase  of  the  proposed  algorithm. 
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Table  1:  Example  1:  Comparison  of  three  methods  with  Triangular  Impulse  Response 


Method 

Closeness 
in  dB 

Minimized 
Error  Norm 

Number  of 
Iterations 

Covariance 

6.359 

154.9 

Auto-Correlation 

6.674 

144.093 

Proposed 

23.436 

3.037 

5 

Phase- 1 

Proposed 

23.439 

3.035 

3 

Phase-2 

Simulation  2 : 

An  arbitrary  impulse  response  was  generated  with  p  =  5  for  these  simulations.  If  the  algorithm  in  Subsection 
III  is  used  directly  to  match  the  true  response  it  will  give  perfect  results.  Instead,  Gaussianly  distributed  white 
noise  was  added  to  the  true  response  to  obtain  the  desired  response  h/>.  Hence,  the  estimates  obtained  with  the 
proposed  algorithm  will  also  be  the  MLE  of  the  unknowns.  For  20dB  noise,  the  true  and  the  desired  responses 
are  shown  in  Fig.  2A.  The  impulse  response  match  using  Covariance  method  and  Autocorrelation  method  are 
shown  in  Figures  2B  and  2C,  respectively.  The  initial  estimate  obtained  by  minimizing  the  equation  error  in 
(24a)  is  shown  in  Fig.  2D.  The  impulse  response  fit  obtained  using  the  proposed  algorithm  at  the  end  of  Phase- 1 
and  Phase-2  are  shown  in  Fig.  2E  and  Fig.  2F,  respectively.  The  minimized  error  norms  and  the  closeness  to  the 
true  response  are  listed  in  Table  2.  It  can  be  observed  for  this  example  that  there  is  about  3dB  difference  in  the 
impulse  response  fit  between  the  two  phases  though  the  difference  in  the  minimized  error  norms  is  quite  small. 
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Table  2  :  Example  2  :  Comparison  of  three  methods  with  5-th  order  Impulse  Response 


Method 

Closeness 
in  dB 

Minimized 
Error  Norm 

Number  of 
Iterations 

Covariance 

1.027 

1.847 

Auto-Correlation 

12.442 

0.691 

Equation  Error 

15.459 

0.656 

in  (24a) 

Proposed 

17.889 

0.646 

5 

Phase- 1 

Proposed 

20.883 

0.634 

4 

Phase-2 

FVom  these  simulation  results  a  fair  conclusion  may  be  drawn  that  the  Phase- 1  of  the  algorithm  does  an 
excellent  job  of  error  minimization.  Hence,  the  Phase-2  of  the  algorithm  need  not  be  invoked  for  most  applications. 
The  results  using  SMM  are  close  to  the  results  at  the  end  of  Phase-1  if  the  original  SMM  [3]  or  the  decoupled 
form  in  [34]  are  used.  There  were  some  numerical  differences  in  the  coefficients  but  the  impulse  response  fit  looked 
almost  alike.  The  results  with  the  AR-version  of  SMM  given  in  [7]  were  poorer  than  Phase-1  results  because  the 
numerator  coefficient  is  set  to  1  in  [7],  Extensive  simulations  with  other  examples  show  equivalent  performance. 
Interestingly,  the  simulations  also  indicate  that  the  proposed  algorithm  is  quite  immune  to  the  choice  of  initial 
estimates.  In  fact,  when  CM  or  AM  were  used  in  place  of  (32)  for  obtaining  the  initial  estimates,  the  results 
obtained  at  the  end  of  Phase-1  or  Phase-2  turned  out  to  be  exactly  identical  to  the  results  listed  in  the  Tables.  But 
with  Covariance  method  as  initial  estimate,  the  Phase- 1  sometimes  took  one  or  two  extra  iterations  to  converge. 
This  important  observation  indicates  the  robustness  of  the  proposed  algorithm  to  the  choice  of  initial  estimates. 

V.  Concluding  Remarks 

In  this  Section,  a  classical  rational  model  identification  problem  has  been  addressed.  The  major  focus  was 
to  develop  an  algorithm  for  optimal  estimation  of  the  parameters  of  an  all-pole  transfer  function  with  arbitrary 
number  of  poles  by  model-fitting  a  prescribed  impulse  response.  Unlike  some  existing  results,  no  linearization 
or  approximation  has  been  done  while  deriving  the  theoretical  optimization  criterion.  It  is  shown  that  the 
multidimensional  non-linear  problem  can  be  decoupled  into  two  smaller  problems  of  which  one  is  a  linear  problem 
and  the  other  one  is  a  non-linear  problem.  The  inherent  mathematical  structure  of  the  non-linear  part  is  utilized 
to  formulate  an  efficient  iterative  computational  algorithm  for  estimating  the  denominator  parameters.  Global 
optimality  properties  of  the  estimators  have  been  confirmed  by  relating  the  multidimensional  optimization  problem 
to  certain  well-known  results  in  numerical  analysis.  In  simulation  studies  also,  the  method  has  been  shown  to 
be  highly  effective.  Regarding  possible  future  work,  it  may  be  noted  that  most  of  the  existing  suboptimal  1-D 
algorithms  have  been  extended  for  estimating  2-D  filter  coefficients  from  2-D  spatial  domain  data  [22,  29,  35, 
36,  40-43].  The  possibility  of  formulating  an  optimal  2-D  AR-filter  design  technique  by  extending  the  proposed 
method  is  being  studied  [49].  Extension  of  this  work  for  identification  of  Multidimensional  AR-systems  from 
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multidimensional  impulse  response  data  [38]  is  also  under  progress. 
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Fig.  1  :  Simulation  1  A  triangular  impulse  respoi 
used  in  both  phases.  The  solid  lines  denote 
circles  show  the  fit  with  (A)  Covariance  Met 
convergence  and  (D)  after  Phase-2  converge 


Fig.  2  :  Simulation  2  20dB  noise  was  added  to  a  true  AR(5)  impulse  response  to  form  the  desired 
response  (h<j).  6  —  10-6  was  used  as  stopping  criterion.  The  solid  lines  denote  the  true  impulse 
response.  In  Fig.  2A,  the  connected  circles  show  the  noisy  signal  hj.  In  the  other  plots  the 
connected  circles  show  the  fit  with  (B)  Covariance  Method,  (C)  Autocorrelation  Method,  (D) 
minimization  of  Equation  error  in  (22f),  (E)  Phase- 1  and  (F)  Phase-2  convergence. 
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APPENDIX  Computational  Algorithm  :  Phase  II 


The  second  phase  of  the  iterative  algorithm  is  described  in  this  Appendix.  In  this  phase,  the  derivative  of  the 
matrix  W  vo.r.i.  b  is  taken  into  consideration  while  minimizing  the  fitting  error  norm.  The  complete  expression 
of  the  /j-norm  of  the  error  can  be  written  as, 


||e||2  =  eTe  =  (Wg  +  WGd)T(Wg  +  WGd).  (AA) 

By  setting  the  derivative  of  this  squared  norm  to  zero,  the  updated  b^'+l^  at  the  ( i  +  l)-th  iteration  is  given  by, 


where  (suppressing  the  superscript  0)), 


b«+‘)  =  -  [U(<)G]_1[U(<)]g 

U  A  LtW  +  GtWtW, 


aw 

aw  i 

.Mr6** 

H e"  ] 

(A.  2) 

(A.2a) 

(A.2b) 


106 


This  minimization  phase  continues  until  b,+l  ~  b‘  is  reached  and  this  optimum  b°  vector  corresponds  to  a 
minimum  of  the  error  surface  of  ||e|j^. 
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Section  -  3.4  :  DESIGN  OF  DENOMINATOR-SEPARABLE  2-D  HR  FILTERS 
SUMMARY 

Optimal  design  of  an  important  class  of  two-dimensional  (2-D)  digital  IIR  filters  from  spatial  impulse  response 
data  is  addressed.  The  denominator  of  the  desired  2-D  filter  is  assumed  to  be  separable  into  two  1-D  factors. 
The  filter  coefficients  are  estimated  by  minimizing  the  /2-norm  of  the  error  between  the  prescribed  and  the 
estimated  spatial  domain  responses.  The  denominator  and  numerator  estimation  problems  are  theoretically 
decoupled  into  separate  problems.  The  decoupled  criteria  have  reduced  dimensionality.  The  denominator  criterion 
is  simultaneously  optimized  w.r.t.  the  coefficients  in  both  dimensions  using  an  iterative  algorithm.  The  numerator 
coefficients  are  found  in  a  straight-forward  manner.  If  the  desired  response  is  known  to  be  symmetric,  the 
proposed  algorithm  can  be  constrained  to  produce  optimal  denominators  which  are  identical  in  both  domains. 
The  performance  of  the  algorithm  is  demonstrated  with  simulation  examples. 

I.  Introduction 

Two-Dimensional  IIR  filters  are  commonly  used  in  image  processing  and  2-D  filtering.  Synthesis  of  such 
filters  from  prescribed  spatial  domain  impulse  response  data  is  an  important  and  challenging  design  problem  and 
has  received  considerable  attention  in  recent  literature  [1,  2,  4,  5,  8,  12,  14,  16].  Spatial-domain  design  of  2-D 
IIR  filters  is  analogous  to  1-D  recursive  filter  design  based  on  time-domain  specifications.  Most  2-D  filter  design 
algorithms  are  basically  extensions  of  existing  1-D  algorithms.  In  particular,  Shanks  et  al  [12]  had  extended  the 
work  of  Shanks  [11];  Cadzow  [1]  and  Shaw  and  Mersereau  [16]  utilized  many  of  the  general  non-linear  optimization 
methods;  and  Shaw  and  Mersereau  [16]  also  extended  the  work  of  Steiglitz  and  McBride  [17].  The  1-D  work  of 
Mullis  and  Roberts  [7]  was  further  extended  and  applied  to  the  2-D  case  in  [5]. 

The  approaches  noted  above  do  not  minimize  the  true  spatial  impulse  response  error,  though  it  may  be 
mentioned  that  the  extension  of  Steiglitz-McBride  method  in  [16]  closely  approximates  the  true  fitting  error.  For 
the  strictly-proper  case,  i.e.,  when  the  numerator  order  is  one  less  than  that  of  the  denominator,  Evans  and  Fischl 
(EFM)  had  proposed  an  optimal  method  for  synthesis  of  1-D  IIR  filters  [3].  The  2-D  filter  synthesis  algorithm 
presented  here  is  a  generalization  of  EFM  to  2-D.  Proposed  solution  is  optimal  in  the  sense  that  it  minimizes  true 
and  complete  spatial  error  criterion  for  the  design  of  strictly-propei  2-D  IIR  filters. 

EFM  has  been  found  to  be  highly  accurate  for  1-D  filter  design.  A  modified  complex  version  of  the  EFM 
with  certain  symmetry  constraints  has  also  been  shown  to  be  effective  for  maximum-likelihood  1-D  and  2-D 
frequency-wavenumber  estimation  [  8,  13,  15].  Generalization  of  EFM  for  strictly-proper  2-D  filter  design  has  also 
been  considered  previously  [4,  5],  but  it  appears  that  the  full  potential  of  EFM  has  not  been  utilized  in  the  2-D 
case.  Specifically,  it  will  be  shown  that  the  complete  error  criterion  encompassing  the  entire  subspace  orthogonal 
to  the  model  fitting  error  was  not  optimized  in  [4,  5].  Instead,  two  subop timal  error  criteria  were  formed  in  each 
domain  and  the  filter  coefficients  were  optimized  in  the  two  dimensions  independently. 

In  this  Section,  a  2-D  version  of  EFM  is  developed  for  optimal  design  of  2-D  recursive  filters  from  prescribed 
spatial  domain  data.  The  complete  basis  space  orthogonal  to  the  spatial  fitting  error  will  be  identified  and  the 
corresponding  error  criterion  will  be  shown  to  be  dependent  only  on  the  2-D  filter  parameters.  Similar  to  1-D 
EFM,  the  non-linear  error  criterion  will  be  decoupled  into  a  purely  linear  and  a  non-linear  sub-problem.  For 
the  separable  denominator  case,  it  is  also  shown  that  the  error  vector  possesses  a  quasi-linear  relationship  with 
the  denominator  coefficients  in  both  domains  simultaneously.  Unlike  several  existing  2-D  methods  [1,  3,  4,  5], 
the  exact  fitting  error  is  minimized  w.r.t.  the  filter  coefficients  in  both  dimensions  simultaneously.  Simultaneous 
optimization  is  particularly  effective  for  synthesizing  2-D  filters  with  symmetric  impulse  response  which  are  quite 
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common  in  practice.  In  such  cases,  the  criterion  can  be  constrained  to  produce  identical  denominators  in  both 
domains  ensuring  symmetry  in  the  estimated  spatial  response. 

The  Section  is  arranged  as  follows  :  In  Subsection  II,  the  least-squares  problem  is  stated.  In  Subsection  III, 
the  preliminaries  for  the  non-separable  numerator  and  separable  denominator  case  is  given.  In  Subsection  IV, 
the  new  orthogonal  basis  spaces  are  defined,  the  error  criterion  is  derived  and  the  computational  algorithm  is 
summarized.  In  Subsection  V,  some  simulation  results  are  given  and  finally,  in  Subsection  VI  some  concluding 
remarks  along  with  directions  for  future  work  are  included. 

II.  Problem  Statement  and  Formulation 


In  general,  a  2-D  rational  function  H(zi,  22),  with  non-decomposabie  numerator  and  denominator  is  described  as 

(1) 


HI  \  —  Z2J 

(2"2”-p(z I,*)  ■  Z?J0Z?Jo FfriWV 


Note  that  for  the  strictly-proper  case  of  EFM,  ni  =  mi  —  1  and  nj  =  mj  —  1.  If  the  ifci  x  *2  first  quadrant  samples 
are  assumed  to  be  significant,  H(z  1, 22)  can  also  be  written  as, 


H(zi,Zt)  =  *1*11*2 

where,  z\  A  [1  zf1  •  •  •  z^fc,-1*]T,  *2  A  [1  zj1  •••Zj^3-1^]7’  and 


(2) 


H  A 


r  mo,  0) 

M  1,0) 


h(0,l) 

MU) 


Lh(*i  - 1,0)  mm -1,1) 

Define  a  vector  by  stacking  the  columns  H  as  follows  : 


Mo,  M-i) 
Mi,  *2-1) 

M*l-  1,*2-  l) 


(3) 


h  A 


r  hj  i 
1*2 

h*,J 


(41 


where,  hi  denotes  the  t'h  column  of  H.  Next,  let  the  prescribed  space-domain  impulse  response  matrix  be  denoted 
as, 


X  A 


r  *(o,o) 
*(1,0) 


*(0,1) 

*(1,1) 


-*(*1  -  1,0)  *(*1-1,1) 

A  [xixj  •••  x*,] 


*(0,  *2  1)  1 

*(1>  *2  —  1) 

*(*1  —  1 ,  *2  ~  1)J 


and  the  corresponding  vector  be  formed  as, 


x  A 


*2 

Lx*.J 


(5a) 

(56) 

(6) 


In  this  Section,  the  following  2-D  least-squares  synthesis  problem  is  addressed  : 
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Given  the  2-D  spatial  impulse  response  matrix  X,  estimate  p  and  q  by  optimizing  the  following  error  criterion, 


min||e||J  A  ||x  -  h||2  with,  p(0,0)  =  1,  where, 

Q.P  = 

(7a) 

q  A  (9(0,0)  9(0,1)  •••  9(m.»»3)lT  and 

(76) 

p  A  [p(0,0)  p(0,l)  p(ml,m2)]T 

(7c) 

This  problem  is  nonlinear  in  p  and  standard  gradient-based  optimization  algorithms  have  been  used  for  1-D  as 
well  as  for  2-D  designs  [1,  2,  16].  But  these  generic  algorithms  do  not  make  effective  use  of  the  matrix-structures 
inherent  in  this  particular  problem  and  they  are  known  to  be  sensitive  to  initialization.  Several  sub-optimal 
algorithms  based  on  linearization  of  the  true  non-linear  criterion  have  also  been  proposed  [2,  11,  16,  17].  In  this 
work,  the  exact  fitting  criterion  defined  in  (7)  will  be  theoretically  decoupled  into  a  purely  linear  problem  for  q 
and  a  non-linear  problem  for  p.  Furthermore,  the  non-linear  criterion  will  be  shown  to  possess  a  quasi-linear 
relationship  to  the  unknown  denominator  coefficients.  This  will  lead  to  the  formulation  of  an  iterative  algorithm 
for  its  minimization. 


III.  Design  With  Separable  Denominator  and  Non-Separable  Numerator 
In  this  case,  the  2-D  rational  transfer  function  can  be  written  as, 

,mi-l  V'mj-l  „t; 


21  ’ 22  E£o  E”=o  d(j w 


Define, 


c  A  [c(0)  c(l)  ...c(mi)]T  and 

d  A  [d(0)  d(l)  ...d(m2)]T. 


(8a) 


(8b) 

(8c) 


Multiplying  both  sides  of  (8a)  by  YIT-o  c(02i  *  E^-o  ^O)2^  and  equating  the  coefficients  of  the  same  powers  of 

2'>  [5], 


[Df  0  C^jh  =  q  (9a) 

(Df  0  CT]h  =  0  (96) 

[Dt  0  Cf]h  =  0  (9c) 

[Dt  0  CT]h  =  0  (9d) 

[It,  ®  CT]h  =  0  (9e) 

[DT0lt,]h  =  0  where,  (9/) 


c(mi) 

0 

0 

c(mi  -  1) 

c(m0  ••• 

0 

c(0) 

c(l)  ... 

c(mj) 

0 

c(0)  ••• 

c(mi  -  1) 

0 

0 

C(0)  . 
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N. 


where,  Itl  €  and  h,  £  JRtjx4a  are  identity  matrices  and  ®  denotes  the  Kronecker  product  [9].  If  h,  c 

and  d  are  known,  the  numerator  vector  q  can  be  calculated  using  (9a).  But  in  practice,  fa,  c  and  d  need  to  be 
estimated  from  the  prescribed  response  x.  If  fa  is  replaced  by  the  prescribed  x  in  (9b)-(9f),  the  right  hand  sides 
will  not  be  equal  to  zero.  Instead,  it  will  result  in  the  following  equation  errors  : 


(Df  ®  CT]x  =[Df  ®  CT][h-t-  e]  =  [D?  ®  CT]e  A  (11a) 

[DT  ®  C?]x  =[Dt  ®  Cfl(h+  e]  =  [Dt  ®  Cf]e  A  e’f  (116) 

[Dt  ®  CT]x  =[Dt  ®  CT](h  +  e]  =  [Dt  ®  CT]e  A  e*,  (11c) 

(Itj  ®  CT]x  =p*a  ®  CT][h  +  e]  =  [It3  ®  CT]e  A  (lid) 

[Dt  ®  lkl]x  =[Dt  ®  /*,][ h  +  e]  =  [Dt  ®  /tl]e  A  e*,.  (lie) 


In  (lla)-(lle),  the  fact  that  x  =  fa-1-  e  and  the  orthogonal  relationships  in  (9b)-(9f)  have  been  utilized.  These 
equations  show  the  relationships  between  the  fitting  error  e  and  equation  errors.  As  in  case  of  1-D  EFM  [7],  in 
order  to  minimize  ||e||*  an  inverse  relationship  of  the  form, 

e  =  W(c,d)ee*  (12) 

need  to  be  found.  The  matrix,  W(c,  d)  needs  to  be  constructed  using  c  and  d. 

The  problem  of  determining  the  denominator  coefficient  vectors  c  and  d  is  essentially  equivalent  to  the 
search  for  (ifcikj  -  minis)  linearly  independent  vectors  orthogonal  to  h.  These  orthogonal  basis  space  must  be 
dependent  on  the  elements  in  c  and  d  only.  Equation  (9a)  clearly  shows  that  Di  ®  Ci  €  itjjbj  x  mi  m2  can  not  be 
orthogonal  to  fa.  On  the  other  hand,  (9b)-(9f)  demonstrate  that  the  matrices  Dj®C,  D®  Ci ,  D®  C,  I*3  ®  C  and 
DT  ®  are  indeed  all  orthogonal  to  h.  But  summing  the  respective  number  of  columns  of  these  five  full-rank 
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matrices,  the  total  turns  out  to  be  (3ibiJk2  —  niimj  -  fcimj  —  k2mi)  vectors  of  length  *1*2  each.  Hence,  these  set 
of  matrices  can  not  all  be  linearly  independent  of  each  other.  In  [5],  the  matrices,  Di  ®  C,  D  ®  Ci,  D  ®  C,  as 
formed  in  (9b)-(9d),  respectively,  were  utilized  to  form  an  inverse  relationship  as  required  by  (12).  These  set  of 
matrices  do  contain  (kxk2  -  m\mi)  linearly  independent  vectors  of  length  k\k 2,  but  unfortunately,  they  are  not 
orthogonal  to  each  other. 

It  may  be  noted  here  that,  in  a  previous  2-D  generalization  of  EFM  [4],  nplete  spatial  fitting  error 

criterion  was  not  formulated.  In  a  later  work,  it  was  partially  formulated  in  equ  4)  of  [5],  but  no  algorithm 

was  presented  for  minimizing  that  criterion  with  respect  to  the  unknown  parai,.  .era.  Instead,  in  both  those 
works,  two  separate  criteria  were  minimized  independently.  Specifically,  (9e)  and  (9f)  were  used  in  [4  5)  to 
estimate  c  and  d  using  two  independent  optimizations.  But  It,  ®  C  and  D  ®  /*,  contain  (k\k2  —  k2ml)  and 
(k\k2  —  m2ky )  linearly  independent  vectors,  respectively.  The  entire  (ibi —  mimj)  dimensional  vector-space 
orthogonal  to  h  was  not  optimized  simultaneously  w.r.t  c  and  d.  It  is  not  apparent  if  the  optima  of  these  separate 
criteria  are  identical  to  those  of  the  true  2-D  criterion.  In  Subsection  IV,  a  new  set  of  or  agonal  vectors  will  be 
constructed  which  will  lead  to  the  formulation  of  an  exact  2-D  spatial  fitting  error  criterion  that  can  be  optimized 
simultaneously  w.r.t.  c  and  d. 

IV.  Formulation  of  the  Orthogonal  Basis  Space  : 


According  to  orthogonality  principle  [15],  the  fitting  error  e,  at  minimum,  ought  to  be  orthogonal  to  the 
‘estimated’  h  that  minimizes  the  error.  It  is  also  desirable  to  have  the  resulting  error  criterion  dependent  on  the 
denominator  coefficients  only.  To  meet  these  requirements  two  Vandermonde  matrices  are  formed  as  follows, 
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1  ■ 

'  1 

1  • 

h  . 
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••  9m, 

jkrl  . 
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where,  U  =  eJW,;u>i, »  =  l,...,n»i  and  ty  =  ei9,-,0i,i  =  I,..., m2  be  the  roots  of  the  polynomials  C(z\)  = 
T2T=q  <(»>r‘  and  D(z2)  =  <*(*)•*£" \  respectively.  Hence,  by  construction, 


CrT  =  0  and  (I4a) 

DtQ  =  0  (146) 


Furthermore,  using  (9e)  and  (9f), 

[Qt  ®  CT]h  =  [Qt  ®  Ip  ®  CT]h  =  0  and  (15o) 

[DT®TT]h  =  [I®Tt][Dt  I]h  =  0.  (156) 


The  orthogonality  relationships  in  (9d),  (15a)  and  (15b)  demonstrate  that  the  three  matrices  Q  ®  C,  D  ®  T 
and  D®  C  together  constitute  (k2k2  -  mi  m2)  dimensional  vector  space  orthogonal  to  h.  Interestingly,  these 
matrices  are  not  only  formed  with  linearly  independent  columns  they  are  also  mutually  orthogonal  to  each  other. 
This  orthogonality  claim  can  be  easily  substantiated  as  follows  : 

[Qt®Ct][D®T]  =  [QtD®CtT]  =  [0®0]  =  0, 

[Qt®Ct][D®C]  =  [QtD®CtC]  =  [0®CTC]  =  0 
[Dt®Tt][D®C]  =  [DtD®TtC]  =  [DtD®0]  =  0. 
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and 


(16a) 

(166) 

(16c) 


It  should  also  be  mentioned  here  that  the  matrices  T  and  Q  are  useful  only  in  this  intermediate  stage  of  deriving 
the  2-0  error  criterion  and  they  will  not  be  needed  in  the  final  optimization  steps. 

If  the  vector  h  in  (15)  is  replaced  by  x,  then  similar  to  (11),  the  following  equation  errors  are  formed  : 

/Qt®Ct\  /Qt®Ct\ 

I  Dt®Tt  x  =  Dt®Tt  e  A  eef.  (17) 

\D  T®CTJ  \dt®Ct/  “ 

The  optimized  e  must  be  orthogonal  to  the  estimated  h,  whereas  the  three  matrices,  Q  ®  C,  D  ®  T  and  D  ®  C, 

were  shown  to  be  orthogonal  to  h  in  (9d),  (15a)  and  (15b),  respectively.  Hence,  e  can  be  constructed  as  a  linear 
combination  of  the  columns  of  these  orthogonal  set  of  matrices,  i.e., 


e  =  (Q  ®  C  D  ®  T  D  ®  C  )f 


(18) 


where, 


f  A  [fl  /j  /jitj-nimjf 


(19) 


is  a  vector  of  constants  which  are  to  be  determined.  Using  this  form  of  e  in  (17),  the  equation  error  can  be  written 
as, 

CTC 


(qt®ct\  /QtQ®CtC  0  0  \ 

Dt®Tt  x=  0  DtD®TtT  0  fAe,,. 

\Dt®C  t)  \  0  0  dtd®ctc/  ~ 


(20) 


The  matrix  on  the  r.h.s.  is  square  block-diagonal  with  square  diagonal  blocks  and  hence  it  can  be  inverted  to 
uniquely  determine  the  vector  of  constants  f  as, 
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'(QtQ)-1  ®  (CTC)-1  0  0 

0  (DtD)-1®(TtT)"1  0 

0  0  (DtD)-1®(CtC)- 

/  (QTQ)_lQT  ®  (CTC)-lC^ 

=  I  (DtD)-1Dt®(TtT)-1 


.)( 


lcT\ 

-*Tt  X. 
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\(DtD)-1Dt®(CtC  )- 
Using  this  expression  of  f  in  (18)  the  fitting  error  becomes, 

e  =  [Pq  ®  Vc  +  ®  Vt  +  Vd  ®  Vc\* 


Qt®Ct\ 
Dt®Tt|x  (21a) 


(216) 


(22) 


where,  V\.\  denotes  the  Projection  matrix,  e.g.,  Vc  A  C(CTC)_1CT.  Unfortunately,  this  error  vector  is 
dependent  on  T  and  Q  which  must  be  removed.  According  to  (14),  the  matrices  C  and  D  are  orthogonal  to  T 
and  Q,  respectively.  Furthermore, 


rank(T)  +  rank(C)  =  ki 
rank(Q)  +  rank(D)  =  *2 


and 


(23a) 

(236) 


Hence,  using  a  theorem  on  Projection  matrices  [14], 


Vc  +  Vt  =I*t 
'Pd  +  Vq  =1  ki¬ 


rn 


and 


(24a) 

(246) 


Using  these  relationships  in  (22),  Vt  and  Vq  can  now  be  replaced  and  the  error  vector  can  be  written  as, 


e  =  ~  VD)  ®  Vc  +  Pd  ®  (It,  -  VC)  +  Pd  ®  Pc]x  (25a) 

=  Pt,® Pc  -  Pd® Pc  +  Pd  ®  It,  -  Pd®Pc  +  Pd®Pc)x  (256) 

=  (It,  ®  Pc  +  Pd  ®  It,  -  Pd  ®  Pc]x.  (25c) 

Note  that  in  this  final  form  of  the  error,  there  is  no  dependence  on  either  T  or  Q.  Hence,  the  error  criterion  for 
determining  the  denominator  coefficient  vectors  c  and  d  can  now  be  written  as, 

mine||(c,  d)||2  =  min  (xT(It3  ®  Pc  +  Pd  ®  It,  -  Pd  ®  Pc]x)  (26) 

ctd  c,d 


Equations  (26)  and  (9a)  represent  the  desired  decoupled  criteria  for  determining  the  denominator  and  numerator 
coefficients,  respectively.  Optimization  of  (26)  would  produce  the  optimal  c  and  d,  denoted  as,  c°  and  d°, 
respectively.  Letting  e°  denote  the  minimized  error  corresponding  to  the  optimum  denominator  coefficients,  the 
optimum  spatial-response  vector  h  can  be  found  from, 

h°  A  x  -  e°.  (27) 

This  h°  can  then  be  used  in  (9a)  to  obtain  the  optimal  numerator  vector,  q°. 

Analyzing  the  criteria  in  (26)  it  is  apparent  that  the  first  two  terms  are  the  orthogonal  projections  of  the  data 
x  on  to  the  parameter  spaces  of  each  of  the  two  spatial  dimensions.  The  third  term  is  the  orthogonal  projection 
common  to  both  dimensions  but  is  subtracted  once  because  the  common  (or,  joint)  projections  have  already  been 
included  once  in  each  of  the  first  two  terms.  It  is  very  interesting  to  note  that  this  criterion  is  quite  analogous 
to  the  standard  formula  of  the  Probability  of  Union  of  two  subsets.  It  may  be  emphasized  here  that  this  form  of 
the  error  criterion  is  not  only  mathematically  appropriate  it  is  intuitively  appealing  as  well  and  this  form  of  the 
2-D  error  criterion  was  not  arrived  at  in  any  of  the  previous  generalizations  of  EFM  [4,  5].  With  further  algebraic 
manipulations,  the  error-vector  e  can  also  be  shown  to  be  related  to  both  the  denominator  vectors  c  and  d  in  a 
9«ast-linear  manner  as, 


e(c,d)  A  (  (Ik,  —  PD)  ®  W(c))X* 

(W(d)®I*,)X2) 

(:)■ 

(28a) 

where, 

W(c) 

A  C(CTC)_1 

and 

(286) 

W(d) 

A  D(DTD)_l 

• 

(28c) 

X*  and  X2  are  constructed  from  the  elements  in  X  as  [9,  17], 
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where,  the  (l,  k),h 

term  of  X<  is  formed  as, 

Xi(l,k)  A  Xi(l-k 
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Equation  (28a)  is  one  the  key  results  derived  in  this  Section.  It  clearly  shows  that  both  the  unknown  coefficient 
vectors  c  and  d  appear  simultaneously  in  a  ‘quasi-linear’  relationship  w.  r.t.  the  true  error  vector.  This  quasi-iinear 
relationship  can  be  exploited  for  simultaneous  optimization  of  the  criterion  in  (26)  w.r.t.  c  and  d.  The  algorithm 
is  similar  in  flavor  to  the  ones  in  [3-5,  8,  13],  except  that  both  the  denominators  are  optimized  and  estimated 
simultaneously.  Specifically,  the  algorithm  iteratively  minimizes  the  f2-norm  of  the  error  vector  formed  in  (26) 
in  two  phases.  In  Phase- 1,  the  W  matrices  are  treated  as  constants  and  are  formed  using  the  estimates  of  c  or 
d  obtained  at  the  previous  iteration.  In  Phase-2,  the  estimates  are  improved  upon  by  setting  the  gradient  of  the 
complete  error-norm  to  zero.  The  iterations  are  initialized  by  setting  c  =  [1  0  ...  0]T  and  d  =  [1  0  ...  0]T. 
The  iterations  are  continued  until  the  changes  in  the  estimates  in  successive  iterations  become  very  small.  It  may 
be  noted  here  that  extensive  simulation  experience  in  1-D  [7,  8,  13,  14]  as  well  as  for  2-D  cases  [13,  15]  indicate 
that  Phase- 1  itself  produces  very  good  estimates  of  the  filter  coefficients  and  in  most  cases,  there  may  not  be  any 
need  for  invoking  Phase-2  at  all.  It  may  be  noted  here  that  in  [4,  5],  the  complete  error  e(c,d)  in  (26)  was  not 
optimized. 

Symmetric  Spatial  Response  -  A  Special  Case  :  Many  2-D  filters  used  in  image  processing  are  symmetri¬ 
cally  shaped  in  the  spatial  domain.  Some  notable  examples  are,  Gaussian  and  Circular  filters.  In  designing  such 
spatially  symmetric  2-D  filters,  the  methods  in  [4,  5]  sometimes  produced  slightly  different  sets  of  denominator 
polynomials.  Hence,  the  estimated  spatial  response  may  not  possess  the  desired  symmetry.  This  problem  may 
be  attributed  to  separate  estimation  of  the  individual  denominators.  In  the  proposed  approach,  both  the  denom¬ 
inators  are  optimized  simultaneously  by  minimizing  the  entire  error  in  (28a).  If  necessary,  the  desired  symmetry 
may  be  imposed  by  setting,  c  =  d  in  (28a)  at  the  outset.  For  this  special  but  very  important  special  case,  (28a) 
would  have  the  following  form  : 

e(c)  A  [((I*a  -  PC)®Wtl(c))Xl  +  (Wta(c)®Itl)X2]c,  (31) 

where,  the  subscripts  of  W  denote  leading  dimensions  which  may  be  unequal.  Minimization  of  the  norm  of  the 
error  in  (31)  will  result  in  a  single  set  of  optimal  coefficients  meant  for  both  dimensions.  This  is  one  of  the  major 
advantages  of  the  proposed  approach  over  the  ones  in  [4,  5]  where  separate  optimization  in  each  domain  does  not 
necessarily  guarantee  identical  denominator  coefficients  in  both  domains. 

V.  Simulation  Results 

In  order  to  demonstrate  the  effectiveness  of  the  proposed  algorithm,  the  results  of  the  design  of  a  Gaussian  Filter 
are  given  here.  The  spatial  response  of  a  quarter  plane  Gaussian  filter  defined  over  the  first  quadrant  is  given  by 

H(i,j )  =  0.256322  et-#103203<<<-«>’+tf -<>’>], 

where,  ( i,j )  G  5/  and  the  support  S/  is  given  by  5/  =  {(i,j)  |  0  <  »  <  14;  0  <  j  <  14).  The  true  or  the 
desired  spatial  response  is  shown  in  Fig.  1.  Note  that  the  spatial  response  is  symmetric  around  its  center  point. 
Fig.  3  through  5  show  the  estimated  responses  for  filter  orders  (mi  =  m2),  4,  5  and  6,  respectively.  The  results 
were  obtained  by  minimizing  the  norm  of  the  error  vector  in  (31)  with  different  orders.  The  algorithm  converged 
in  5-7  iterations.  The  plots  clearly  show  that  the  estimated  spatial  impulse  responses  match  the  desired  one 
quite  closely  and,  as  can  be  expected,  the  match  improves  as  the  filter  order  increases.  With  sixth-order  the 
difference  between  the  true  and  the  estimated  response  is  almost  negligible.  The  closeness  between  the  true  and 
the  estimated  response  was  also  measured  in  terms  of  the  ratio  of  the  power  of  the  true  response  to  that  of  the 
errors  in  each  case.  The  ratios  were  found  to  be  about  41.2dB,  61.4dB  and  8G.5dB  for  filter  orders  4,  5  and  6, 
respectively.  Simulations  with  other  forms  of  2-D  filters  also  showed  similar  performance. 
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VI.  Conclusion  and  Future  Work  : 

An  optimal  2-D  IIE  filter  design  method  has  been  presented.  The  algorithm  is  a  2-D  extension  of  an 
existing  optimal  1-D  approach.  The  2-D  model-fitting  criterion  has  been  decoupled  into  a  linear  and  a  nonlinear 
sub-problems.  The  non-linear  part  has  been  shown  to  possess  a  quasi-linear  relationship  with  the  unknown 
denominator  coefficients.  The  algorithm  simultaneously  optimises  the  coefficients  in  both  dimensions.  Regarding 
future  work,  it  may  be  noted  that  similar  to  EFM  [3],  the  proposed  algorithm  is  also  applicable  for  strictly-proper 
designs  only,  albeit  in  2-D.  Recently,  an  optimal  1-D  algorithm  (OM)  which  is  applicable  for  any  genera)  system 
with  arbitrary  number  of  poles  and  zeros,  has  been  presented  in  [14],  Unlike  EFM,  the  general  1-D  algorithm 
in  [14]  formulates  the  criterion  entirely  differently.  It  shows  explicitly  that  the  true  error  is  linearly  related  to 
the  numerator  coefficients  whereas  the  denominator  is  nonlinearly  related.  The  possibility  of  extending  this  work 
for  designing  2-D  filters  with  any  arbitrary  numbers  of  denominator  and  numerator  orders  is  presently  under 
investigation. 
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Fig.  1  :  The  True  Spatial-Domain  response  for  a  Fig.  2  ;  The  Estimated  Spatial-Domain  response 

15  x  15  Gaussian  Filter.  with  mi  =  «i  =  4 


Fig.  3  :  The  Estimated  Spatial-Domain  response 


Fig.  4  :  The  Estimated  Spatial-Domain  response 


